Old 09-20-2012, 06:05 PM   #1
squigish
Device: Kindle 3, Kindle Touch
Nytimes web recipe intermittent 404 error

I've got a cron job on an Ubuntu server set up to download the NYTimes web edition every hour and put it on a web server, where I can download it to my Kindle.

Almost always (about 20-23 of the 24 runs each day), calibre hits an HTTP 404 error and doesn't generate the file. Here's what I'm running, and the output I get:

Code:
$ /u1/myhome/bin/ebook-convert /u1/myhome/nytimes/nytimes_sub.web.recipe /u1/myhome/pub_http_internet/nytimes/nytimes-web-test.mobi --username myuser --password mypass --output-profile kindle
1% Converting input to HTML...
InputFormatPlugin: Recipe Input running
1% Fetching feeds...
Index URL: http://www.nytimes.com/pages/world/index.html
Index URL: http://www.nytimes.com/pages/national/index.html
Index URL: http://www.nytimes.com/pages/politics/index.html
Index URL: http://www.nytimes.com/pages/nyregion/index.html
Index URL: http://www.nytimes.com/pages/business/index.html
Index URL: http://www.nytimes.com/pages/technology/index.html
Index URL: http://www.nytimes.com/pages/science/index.html
Index URL: http://www.nytimes.com/pages/health/index.html
Index URL: http://www.nytimes.com/pages/opinion/index.html
Index URL: http://www.nytimes.com/pages/arts/index.html
Index URL: http://www.nytimes.com/pages/books/index.html
Index URL: http://www.nytimes.com/pages/movies/index.html
Index URL: http://www.nytimes.com/pages/arts/music/index.html
Index URL: http://www.nytimes.com/pages/arts/television/index.html
Index URL: http://www.nytimes.com/pages/dining/index.html
Index URL: http://www.nytimes.com/pages/travel/index.html
Index URL: http://www.nytimes.com/pages/education/index.html
Index URL: http://www.nytimes.com/pages/magazine/index.html
Index URL: http://www.nytimes.com/pages/weekinreview/index.html
Traceback (most recent call last):
  File "site.py", line 58, in main
  File "site-packages/calibre/ebooks/conversion/cli.py", line 325, in main
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 979, in run
  File "site-packages/calibre/customize/conversion.py", line 208, in __call__
  File "site-packages/calibre/ebooks/conversion/plugins/recipe_input.py", line 105, in convert
  File "site-packages/calibre/web/feeds/news.py", line 881, in download
  File "site-packages/calibre/web/feeds/news.py", line 1025, in build_index
  File "<string>", line 582, in parse_index
  File "<string>", line 455, in parse_web_edition
  File "<string>", line 379, in index_to_soup
  File "<string>", line 362, in get_the_soup
  File "site-packages/mechanize/_mechanize.py", line 199, in open_novisit
  File "site-packages/mechanize/_mechanize.py", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 404: Not found

The only differences between my nytimes_sub.web.recipe and the nytimes_sub recipe I downloaded today from Launchpad are the following (Launchpad version on the left, mine on the right):
Code:
    webEdition = False					      |	    webEdition = True
    oldest_article = 7					      |	    oldest_article = 2
    useHighResImages = True				      |	    useHighResImages = False
    excludeSections = []				      |	    excludeSections = ['Sports']
                    (u'Sports',u'sports'),		      <
                    (u'Style',u'style'),		      <
                    (u'Fashion & Style',u'fashion'),	      <
                    (u'Home & Garden',u'garden'),	      <
                    ('Multimedia',u'multimedia'),	      <
                    (u'Obituaries',u'obituaries'),	      <

(For those of you who don't speak diff: all I changed was to turn on webEdition, exclude some sections, shorten oldest_article from 7 days to 2, and turn off high-res images.)
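
Until the cause of the 404 is found, one possible stopgap is to have the cron job retry the conversion a few times instead of giving up on the first failure. The following is only a minimal sketch, not anything that ships with calibre: run_with_retries is a hypothetical helper name, the attempt count and delay are arbitrary, and it assumes ebook-convert exits with a non-zero status when the traceback above occurs.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=60):
    """Run cmd (a list of arguments) until it exits with status 0.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        # subprocess.call returns the exit status of the command.
        if subprocess.call(cmd) == 0:
            return attempt
        # Wait before the next try, but not after the last one.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return 0


# Hypothetical usage from the hourly job, with the same arguments as the
# command shown earlier in this post:
#   run_with_retries(["/u1/myhome/bin/ebook-convert",
#                     "/u1/myhome/nytimes/nytimes_sub.web.recipe",
#                     "/u1/myhome/pub_http_internet/nytimes/nytimes-web-test.mobi",
#                     "--username", "myuser", "--password", "mypass",
#                     "--output-profile", "kindle"])
```

This only papers over the failure. If the 404 comes from one particular section index URL, retrying the whole conversion will not help, and the recipe itself would need to change.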

Has anyone else been experiencing similar problems?