MobileRead Forums - View Single Post - Missing Articles when Scheduling NYT Download

bcollier · 02-14-2011, 11:39 AM

I've found that when I schedule the NYT subscription recipe to download at a set time, many of the front page articles are often missing. If I am at the computer and hit "download now", the articles are all there.

I tested this with debug on. The scheduled download at 7am was missing all but 2 front page articles; when I manually hit download at 7:20, all the articles were there.

The problem appears to be related to pop-up ads or ads of some kind, but I've never been able to see them myself. Both logs are attached (the run with missing articles and the run with all the articles), and the key section of the log is pasted below. Any thoughts on how to fix this?

Fetching http://www.nytimes.com/2011/02/14/bu...pagewanted=all
Fetching http://www.nytimes.com/2011/02/14/wo...pagewanted=all
Fetching http://www.nytimes.com/2011/02/14/ny...pagewanted=all
1% Starting download [5 thread(s)]...
Found forwarding link: /2011/02/14/us/14giffords.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/us/14giffords.html?pagewanted=all'
Found forwarding link: /2011/02/14/world/middleeast/14egypt-tunisia-protests.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/world/middleeast/14egypt-tunisia-protests.html?pagewanted=all'
Found forwarding link: /2011/02/14/nyregion/14chicken.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/nyregion/14chicken.html?pagewanted=all'
Found forwarding link: /2011/02/14/business/14retirees.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/business/14retirees.html?pagewanted=all'
Found forwarding link: /2011/02/14/world/middleeast/14egypt.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/world/middleeast/14egypt.html?pagewanted=all'
Could not fetch link http://www.nytimes.com/2011/02/14/us...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 428, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 153, in get_soup
File "site-packages\calibre\ebooks\chardet\__init__.py", line 86, in xml_to_unicode
TypeError: 'NoneType' object is not callable

http://www.nytimes.com/2011/02/14/us...pagewanted=all saved to
Downloading
Fetching http://www.nytimes.com/2011/02/14/bu...pagewanted=all
Failed to download article: Word and Lyric, Giffords Labors to Speak Again from http://www.nytimes.com/2011/02/14/us...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\utils\threadpool.py", line 95, in run
File "site-packages\calibre\web\feeds\news.py", line 852, in fetch_article
File "site-packages\calibre\web\feeds\news.py", line 847, in _fetch_article
Exception: Could not fetch article. The debug traceback is available earlier in this log

1% Article download failed: u'Word and Lyric, Giffords Labors to Speak Again'
Could not fetch link http://www.nytimes.com/2011/02/14/ny...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 428, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 153, in get_soup
File "site-packages\calibre\ebooks\chardet\__init__.py", line 86, in xml_to_unicode
TypeError: 'NoneType' object is not callable
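
For anyone trying to follow the "Found forwarding link" / "Skipping ad to article" pairs in the log, here is a rough sketch of the URL rewrite they seem to show: the adxnnl and adxnnlx query parameters are dropped and the relative link is resolved against the site root. This is not the recipe's actual code. The function name strip_ad_params and the AD_PARAMS set are made up for illustration, and the parameter names are taken only from the log lines. It is written for Python 3's standard library rather than for the Python bundled with calibre.

```python
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit

# Assumption: these two query parameters mark the ad interstitial.
# The names come only from the "Found forwarding link" lines in the log.
AD_PARAMS = {"adxnnl", "adxnnlx"}


def strip_ad_params(link, base="http://www.nytimes.com"):
    """Resolve a (possibly relative) link and drop the ad-related parameters.

    Hypothetical helper for illustration; not part of calibre or the recipe.
    """
    absolute = urljoin(base, link)
    parts = urlsplit(absolute)
    # Keep every query parameter except the ad-related ones, preserving order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in AD_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    forwarding = (
        "/2011/02/14/us/14giffords.html"
        "?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg"
    )
    print(strip_ad_params(forwarding))
```

Run on the first forwarding link from the log, this gives the same URL that the matching "Skipping ad to article at" line reports. It only illustrates the rewrite and does not explain the TypeError in the tracebacks.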