Old 12-17-2011, 06:04 PM   #1
duluoz
Newsbeamer dev
Posts: 122
Karma: 1000
Join Date: Dec 2011
Device: Kindle Voyage
Recipe request:

Hi all,

Would anybody be able to help me to create a recipe?

Ideally, we could create a recipe that parses the current issue of the magazine. You need to log in to see this (but I'll PM the login details if anyone is willing to help).

I'd be very grateful if anyone were willing and able to help!

Thanks
Jamie

Last edited by duluoz; 01-06-2012 at 07:14 PM.
Old 12-18-2011, 01:14 AM   #2
Barty
doofus
Posts: 2,513
Karma: 13036221
Join Date: Sep 2010
Device: Kobo Libra 2, Kindle Voyage
I don't have a login, so I'm doing this blind. This may even work.

Edit: there was an error in parse_index. Fixed.

I'm getting a Forbidden error when fetching the link, however. Maybe if you have a login it would work? Cross fingers.
Attached Files
File Type: zip prospectmaguk.zip (1.1 KB, 218 views)

Last edited by Barty; 12-18-2011 at 10:39 AM.
Old 12-18-2011, 05:18 AM   #3
duluoz
Thanks for having a go Barty - but I'm afraid it didn't work.

The result was just 3 pages - one with the date, one with the issue number, and one blank.

I was wondering whether it might work to parse the links based on the sections in the current issue (Features, Opinion, Science & Tech, etc.).
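
For reference, a section-based index maps naturally onto what a calibre recipe's parse_index() returns: a list of (section title, list of article dicts) pairs. The sketch below shows only that grouping step. The helper name group_by_section and the sample triples are hypothetical, and scraping the links from the issue page is left out because the page markup isn't shown in this thread.

```python
from collections import OrderedDict


def group_by_section(links):
    """Group (section, title, url) triples into the list-of-feeds shape
    that a calibre recipe's parse_index() returns.

    `links` is assumed to be already scraped from the issue page; the
    scraping itself is not shown here.
    """
    feeds = OrderedDict()  # keeps sections in the order they first appear
    for section, title, url in links:
        article = {'title': title, 'url': url, 'description': '', 'date': ''}
        feeds.setdefault(section, []).append(article)
    # Shape: [(section_title, [article_dict, ...]), ...]
    return list(feeds.items())


# Made-up example data, only to show the shape of the result.
sample = [
    ('Features', 'Example feature', 'http://example.com/features/1'),
    ('Opinion', 'Example column', 'http://example.com/opinion/1'),
    ('Features', 'Another feature', 'http://example.com/features/2'),
]
index = group_by_section(sample)
```

Inside a recipe, parse_index() would build the triples from the issue page and return the grouped list.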

Thanks again

Jamie
Old 12-19-2011, 11:01 AM   #4
Barty
See my edited post above
Old 12-19-2011, 11:48 AM   #5
duluoz
Barty - I appreciate you trying to crack this. It's still not working though, I'm afraid. It pulls the cover and the issue number, but no articles. I've posted the PDF it creates in case it helps.

https://docs.google.com/open?id=0B0O...FiYTJkNGI5NGZm

I'll also PM a login - although I think it should work even without one, and would just pull fewer articles.
Old 12-19-2011, 05:05 PM   #6
Barty
Yeah, sorry, I'm stumped. It's parsing the index and getting the title and URL correctly, but fetching the URL gives a forbidden error. Maybe Kovid or someone else can take a look at it.
Old 12-19-2011, 06:19 PM   #7
duluoz
Thanks Barty. I'm looking at the output from running ebook-convert in the command line - as you say, it's correctly finding the article URLs, but then not able to download the articles.

I don't get forbidden errors, just 'Failed to download article [article name]'

EDIT: just checked the debug file, and I get the same 'Failed to download article' error there. I wonder if it's a 403 error somewhere?

Seems very strange.

Last edited by duluoz; 12-19-2011 at 06:59 PM.
Old 12-19-2011, 10:23 PM   #8
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That likely means that the login failed. Check the result of the submit() in get_browser() to ensure the login actually worked.

html = br.submit().read()
Old 12-20-2011, 05:07 AM   #9
duluoz
Kovid - thanks for the reply, much appreciated. I'm not sure how to use the code snippet you provided though - where can I access the value of 'html'?

And I'm not convinced it's a login question - I thought perhaps it had something to do with the user agent? You can access the articles it failed to download whether you are logged in or not.

I posted this as a new thread, with the specific problem. Perhaps you might have a chance to take a look at the error message?

https://www.mobileread.com/forums/sho...d.php?t=161669

Thanks again
Old 12-20-2011, 05:10 AM   #10
kovidgoyal
The news download system automatically sets the user agent to mimic a browser. The error message is a generic HTTP 403, there's no way to know from it why permission is denied. You can always save the html to a file with

open('path_to_some_file.html', 'wb').write(html)

and open it in a browser/text editor later.
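
As for where html lives: it is a local variable inside get_browser(), so it has to be written out there. Below is a minimal sketch of how the two snippets might fit together. The wordpress_test_cookie seen in the request headers later in this thread hints at WordPress, but the login URL, the form index, and the field names log and pwd are assumptions, as are the class name, the helper dump_html, and the output path.

```python
import os
import tempfile

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Lets the file be imported outside calibre; the recipe only runs inside it.
    BasicNewsRecipe = object


def dump_html(html, path):
    """Write the raw bytes of a response to disk for later inspection."""
    with open(path, 'wb') as f:
        f.write(html)
    return path


class ProspectSketch(BasicNewsRecipe):  # hypothetical recipe name
    title = 'Prospect Magazine (sketch)'
    needs_subscription = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        # Assumed login URL, form index and field names (WordPress defaults).
        br.open('http://www.prospectmagazine.co.uk/wp-login.php')
        br.select_form(nr=0)
        br['log'] = self.username
        br['pwd'] = self.password
        html = br.submit().read()
        # Save whatever the server sent back after the login attempt.
        dump_html(html, os.path.join(tempfile.gettempdir(), 'login_result.html'))
        return br
```

Opening login_result.html afterwards shows whether the response is the logged-in page or the login form again.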
Old 12-20-2011, 12:03 PM   #11
Barty
The login appears correct. Output attached.

However, even if I set needs_subscription = False and remove get_browser(), I shouldn't get Forbidden; I should get a stub page with a summary of the article and a prompt to subscribe or log in to see the full article. At least that is what I get using a regular browser.

So I looked at my browser's network log, and it looks like the page returns 403 even if you are logged in, but the response body contains the full article and looks normal if you are reading with your browser.

Code:
URL:	http://www.prospectmagazine.co.uk/2011/12/time-travel/
Method:	GET
Status:	403 Forbidden
Duration:	1751 ms

Request details
GET /2011/12/time-travel/ HTTP/1.1 
User-Agent: Opera/9.80 (Windows NT 6.1; U; Edition United States Local; en) Presto/2.10.229 Version/11.60
Host: www.prospectmagazine.co.uk
Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/webp, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
Accept-Language: en-US,en;q
Accept-Encoding: gzip, deflate
Referer: http://www.prospectmagazine.co.uk/issue/190/
Cookie: wordpress_test_cookie=(snip)
Connection: Keep-Alive

Request body
No request data

Response details
HTTP/1.1 403 Forbidden 
Date: Tue, 20 Dec 2011 16:58:08 GMT
Server: Apache
X-Pingback: http://www.prospectmagazine.co.uk/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Link: <http://www.prospectmagazine.co.uk/?p=103609>; rel=shortlink
Last-Modified: Tue, 20 Dec 2011 16:58:08 GMT
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Body
.... full body snipped ...
Attached Files
File Type: zip login.zip (12.4 KB, 206 views)
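
The behaviour in the log above (an error status with a perfectly usable body) can be reproduced outside calibre with plain urllib, where an HTTPError can still be read like a response. This is only an illustration of what the server is doing, not a fix for the recipe and not how calibre's fetcher works. The function name, the stand-in local server, and its page content are all made up for the demo.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_even_if_error(url):
    """Return (status, body), reading the body even on an HTTP error status."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the body is still readable.
        return err.code, err.read()


class Forbidden(BaseHTTPRequestHandler):
    """Stand-in server: always answers 403 but still sends a page."""

    def do_GET(self):
        body = b'<html><body>full article text</body></html>'
        self.send_response(403)
        self.send_header('Content-Type', 'text/html; charset=UTF-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Ignore any system proxy settings for the local demo.
urllib.request.install_opener(
    urllib.request.build_opener(urllib.request.ProxyHandler({})))

server = HTTPServer(('127.0.0.1', 0), Forbidden)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
status, body = fetch_even_if_error('http://127.0.0.1:%d/' % server.server_port)
server.shutdown()
server.server_close()
```

A browser renders such a page normally because it displays the body regardless of the status code, which matches what the network log shows.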

Last edited by Barty; 12-20-2011 at 03:56 PM.
Old 12-20-2011, 04:44 PM   #12
duluoz
Thanks both - it all seems pretty strange behaviour. Do you think there's no chance then that we could get a working recipe?
Old 12-20-2011, 09:09 PM   #13
kovidgoyal
That's just weird. There's no way to have the news download system handle a website that returns incorrect HTTP codes.
Old 12-20-2011, 10:32 PM   #14
Barty
@duluoz: I think you can try contacting the site and letting them know they're sending back a 403 Forbidden response when fetching an article.

Maybe they're doing it on purpose, or maybe it's a bug.
Old 12-21-2011, 05:58 AM   #15
duluoz
Thanks both - I sent a note to the webmaster, and got a response already.

Last edited by duluoz; 01-06-2012 at 07:10 PM.