Old 12-01-2012, 09:49 AM   #2
BobbyVan
Quote:
Originally Posted by jzall
The NY Times recipe seems to be skipping articles lately. In today's Front Page section (9 Oct), for example, there are 6 articles in that section on the website, but only 4 show up in the Calibre download. Does anyone have any suggestions for fixing this? Thanks.
This has been a recurring problem for me for quite a while now. Today only one article shows up in the recipe's download, whereas the website (http://www.nytimes.com/pages/todayspaper/index.html) shows 6 front page articles. It appears that this happens because the "Front Page" section is formatted differently from the other sections: it includes each article's "lede" as well as links to the comment sections.

Here's a snippet from my log file for a few of the Front Page articles from today that failed to download:

Quote:
http://www.nytimes.com/2012/12/01/bu...pagewanted=all
Downloading
Fetching http://www.nytimes.com/2012/12/01/us...pagewanted=all
Fetching http://www.nytimes.com/2012/12/01/bu...pagewanted=all
Fetching http://www.nytimes.com/2012/12/01/wo...pagewanted=all
Found forwarding link: /2012/12/01/business/a-hospital-war-reflects-a-tightening-bind-for-doctors-nationwide.html?adxnnl=1&pagewanted=all&adxnnlx=1354374508-I536HSB8AoTB7E+3dlGiDg
Skipping ad to article at 'http://www.nytimes.com/2012/12/01/business/a-hospital-war-reflects-a-tightening-bind-for-doctors-nationwide.html?pagewanted=all'
Found forwarding link: /2012/12/01/us/dream-act-gives-young-immigrants-a-political-voice.html?adxnnl=1&pagewanted=all&adxnnlx=1354374508-I536HSB8AoTB7E+3dlGiDg
Skipping ad to article at 'http://www.nytimes.com/2012/12/01/us/dream-act-gives-young-immigrants-a-political-voice.html?pagewanted=all'
Found forwarding link: /2012/12/01/world/middleeast/israel-moves-to-expand-settlements-in-east-jerusalem.html?adxnnl=1&pagewanted=all&adxnnlx=1354374427-+mg8fvhpducBZ7vP0d08XA
Found forwarding link: /2012/12/01/world/africa/south-africa-corruption-fuels-battle-for-political-spoils.html?adxnnl=1&pagewanted=all&adxnnlx=1354374507-TM0xVHzZl0ftY2tWlwhdlA
Found forwarding link: /2012/12/01/business/online-retailers-rush-to-adjust-prices-in-real-time.html?adxnnl=1&pagewanted=all&adxnnlx=1354374508-I536HSB8AoTB7E+3dlGiDg
Skipping ad to article at 'http://www.nytimes.com/2012/12/01/world/middleeast/israel-moves-to-expand-settlements-in-east-jerusalem.html?pagewanted=all'
Skipping ad to article at 'http://www.nytimes.com/2012/12/01/business/online-retailers-rush-to-adjust-prices-in-real-time.html?pagewanted=all'
Skipping ad to article at 'http://www.nytimes.com/2012/12/01/world/africa/south-africa-corruption-fuels-battle-for-political-spoils.html?pagewanted=all'

Could not fetch link http://www.nytimes.com/2012/12/01/bu...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 474, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 163, in get_soup
File "site-packages\calibre\ebooks\chardet.py", line 109, in xml_to_unicode
File "site-packages\calibre\ebooks\chardet.py", line 73, in detect_xml_encoding
TypeError: 'NoneType' object is not callable

http://www.nytimes.com/2012/12/01/bu...pagewanted=all saved to
Downloading
Fetching http://www.nytimes.com/2012/12/01/sp...pagewanted=all
Could not fetch link http://www.nytimes.com/2012/12/01/wo...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 474, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 163, in get_soup
File "site-packages\calibre\ebooks\chardet.py", line 109, in xml_to_unicode
File "site-packages\calibre\ebooks\chardet.py", line 73, in detect_xml_encoding
TypeError: 'NoneType' object is not callable

http://www.nytimes.com/2012/12/01/wo...pagewanted=all saved to
Failed to download article: Retail Frenzy: Prices on the Web Change Hourly from http://www.nytimes.com/2012/12/01/bu...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\utils\threadpool.py", line 95, in run
File "site-packages\calibre\web\feeds\news.py", line 1017, in fetch_article
File "site-packages\calibre\web\feeds\news.py", line 1012, in _fetch_article
Exception: Could not fetch article. The debug traceback is available earlier in this log
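For anyone who wants to triage a log like the one quoted above, here is a rough Python sketch that lists the URLs reported as "Could not fetch link" and counts the final exception line of each traceback. It assumes the log looks like the snippet above; `summarize_failures` is a hypothetical helper name of mine and is not part of Calibre.

```python
import re
from collections import Counter


def summarize_failures(log_text):
    """Return (failed_urls, exception_counts) for a fetch log.

    Hypothetical helper, not part of calibre. Assumes the log format
    shown in the quoted snippet: a "Could not fetch link <url>" line,
    then a traceback whose last line names the exception.
    """
    failed_urls = []
    exceptions = Counter()
    in_traceback = False
    for raw in log_text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        match = re.match(r"Could not fetch link (\S+)", stripped)
        if match:
            failed_urls.append(match.group(1))
            continue
        if stripped.startswith("Traceback (most recent call last):"):
            in_traceback = True
            continue
        # Inside a traceback, skip frame lines ("File ...") and indented
        # source lines; the first other line is the exception summary.
        if in_traceback and not raw[0].isspace() and not raw.startswith("File "):
            exceptions[stripped] += 1
            in_traceback = False
    return failed_urls, exceptions
```

Calling `summarize_failures(open("recipe.log").read())` (the file name is just an example) returns the list of failed URLs and a `Counter` keyed by exception text, which makes it easier to see whether every failure is the same `TypeError` or several different problems.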

Thanks to everyone who keeps Calibre running and the recipes working!

Last edited by BobbyVan; 12-01-2012 at 10:19 AM. Reason: added log info