MobileRead Forums > E-Book Software > Calibre > Recipes

Old 02-14-2011, 10:39 AM   #1
bcollier
Member
Posts: 22
Karma: 10
Join Date: Jan 2011
Device: Kindle DX
Missing Articles when Scheduling NYT Download

I've often found that when I schedule the NYT subscription recipe to download at a set time, many of the front page articles are missing, but if I am at the computer and hit "download now", the articles are there.

I tested this with debug on. The scheduled download at 7am was missing all but 2 front page articles; when I manually hit download at 7:20, all the articles were there.

The problem appears to be related to pop-up ads or ads of some kind, but I've never been able to see them myself. Both logs are attached (the run with missing articles and the run with all the articles), and a key section of the log is pasted below. Any thoughts on how to fix this?

Fetching http://www.nytimes.com/2011/02/14/bu...pagewanted=all
Fetching http://www.nytimes.com/2011/02/14/wo...pagewanted=all
Fetching http://www.nytimes.com/2011/02/14/ny...pagewanted=all
1% Starting download [5 thread(s)]...
Found forwarding link: /2011/02/14/us/14giffords.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/us/14giffords.html?pagewanted=all'
Found forwarding link: /2011/02/14/world/middleeast/14egypt-tunisia-protests.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/world/middleeast/14egypt-tunisia-protests.html?pagewanted=all'
Found forwarding link: /2011/02/14/nyregion/14chicken.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/nyregion/14chicken.html?pagewanted=all'
Found forwarding link: /2011/02/14/business/14retirees.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/business/14retirees.html?pagewanted=all'
Found forwarding link: /2011/02/14/world/middleeast/14egypt.html?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg
Skipping ad to article at 'http://www.nytimes.com/2011/02/14/world/middleeast/14egypt.html?pagewanted=all'
Could not fetch link http://www.nytimes.com/2011/02/14/us...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 428, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 153, in get_soup
File "site-packages\calibre\ebooks\chardet\__init__.py", line 86, in xml_to_unicode
TypeError: 'NoneType' object is not callable

http://www.nytimes.com/2011/02/14/us...pagewanted=all saved to
Downloading
Fetching http://www.nytimes.com/2011/02/14/bu...pagewanted=all
Failed to download article: Word and Lyric, Giffords Labors to Speak Again from http://www.nytimes.com/2011/02/14/us...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\utils\threadpool.py", line 95, in run
File "site-packages\calibre\web\feeds\news.py", line 852, in fetch_article
File "site-packages\calibre\web\feeds\news.py", line 847, in _fetch_article
Exception: Could not fetch article. The debug traceback is available earlier in this log



1% Article download failed: u'Word and Lyric, Giffords Labors to Speak Again'
Could not fetch link http://www.nytimes.com/2011/02/14/ny...pagewanted=all
Traceback (most recent call last):
File "site-packages\calibre\web\fetch\simple.py", line 428, in process_links
File "site-packages\calibre\web\fetch\simple.py", line 153, in get_soup
File "site-packages\calibre\ebooks\chardet\__init__.py", line 86, in xml_to_unicode
TypeError: 'NoneType' object is not callable
Attached Files
File Type: txt gettodayspaper_debug 702am.log.txt (238.7 KB, 250 views)
File Type: txt gettodayspaper_debug 720am.log.txt (236.8 KB, 252 views)
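
The "Found forwarding link" / "Skipping ad to article" pairs in the log excerpt suggest that the ad-tracking query parameters (`adxnnl`, `adxnnlx`) get dropped while `pagewanted=all` is kept. Below is a minimal Python sketch of that kind of cleanup, based only on the URL shapes visible in the log. `strip_ad_params` is a hypothetical helper for illustration, not the recipe's actual code.

```python
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit

BASE = "http://www.nytimes.com"
# Parameter names taken from the log excerpt; the real site may use others.
AD_PARAMS = {"adxnnl", "adxnnlx"}


def strip_ad_params(link, base=BASE):
    """Return an absolute URL with the ad-tracking parameters removed."""
    parts = urlsplit(urljoin(base, link))
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in AD_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


forwarding = (
    "/2011/02/14/us/14giffords.html"
    "?adxnnl=1&pagewanted=all&adxnnlx=1297684807-9QlfX3WHQH98rsuu9txtHg"
)
print(strip_ad_params(forwarding))
```

For the first forwarding link in the log, this yields the same article URL that the "Skipping ad to article" line reports.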
Old 02-14-2011, 10:45 AM   #2
kovidgoyal
creator of calibre
Posts: 45,319
Karma: 27111242
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
It may be that the articles are not available at 7:00 AM. Try setting your scheduler for 7:20 instead.
Old 02-21-2011, 12:19 PM   #3
mkgtu
Zealot
Posts: 139
Karma: 33000
Join Date: Feb 2010
Device: Currently: Voyage, Oasis 3, Kindle mobile apps, and Kindle Fire
I have had the same problem with front page articles, though the time frame is different (I'm not sure what time zone you're in). I always used to be able to get everything at 4:00 am (Pacific) and never downloaded anything as late as 7:00. E.g. today I got only one front page article at 4:00, but when I manually downloaded the NYT at about 6:30 I got the full front page.
I am now loading two copies of the recipe, one to download at 5:00 am and the second at 5:45, just to see what happens.

It's strange. The paid Amazon Kindle version of the paper appears quite early on the West Coast, at the same time it would hit the streets in New York. I don't wake up at 4:00 am, but it has certainly been there at 5:00 am (8:00 Eastern time). So those articles must be available someplace. And they were always available through calibre scheduled downloads as early as 3:40 am (Pacific) a few months ago!
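
Since the posts in this thread mix Pacific and Eastern times, it can help to put the reported download times on one clock. Below is a small Python sketch that assumes fixed standard-time offsets for February (PST = UTC-8, EST = UTC-5) rather than a full time zone database. `pacific_to_eastern` is an illustrative helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: standard time is in effect (true for February in the US).
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")


def pacific_to_eastern(hour, minute=0):
    """Convert a Pacific time of day to Eastern, returned as 'H:MM'."""
    # The calendar date is arbitrary; only the time of day matters here.
    eastern = datetime(2011, 2, 21, hour, minute, tzinfo=PST).astimezone(EST)
    return f"{eastern.hour}:{eastern.minute:02d}"


# Download times mentioned in this post, all in Pacific time.
for hour, minute in [(3, 40), (4, 0), (5, 0), (5, 45), (6, 30)]:
    print(f"{hour}:{minute:02d} Pacific = {pacific_to_eastern(hour, minute)} Eastern")
```

Under these assumptions, the 4:00 am Pacific run falls at 7:00 am Eastern, the same hour as the scheduled download in the first post, if that post was using Eastern time.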

At one time I had thought that the problem had something to do with the "duplicates" variable. In other words, that if any "International" article were on the Front Page of the paper, it would only appear in the "International" section and not be repeated on the Front Page. But alas! Not so. The actual front page articles seem to be missing from the entire paper!
Old 02-21-2011, 01:30 PM   #4
bcollier
Interesting. I don't think I have a simple answer to the problem; I've tried a bunch of scenarios as well. Scheduling for 7:30am Eastern Time has worked well this week, with all the front page articles available. However, on Sunday I was up at 5:30 and checked the website, and all the articles were up and available, so I'm not sure why they wouldn't download the rest of the week. Strange.
Old 06-03-2011, 09:50 PM   #5
dlgraves
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2011
Device: Kindle
Hello,

Just wondering if this was ever resolved. I've been having the same problem: when I download manually through the GUI, all the NYT articles are there, but when I run a scheduled download (via a command line script), the Front Page section usually has only one or two. I get a slimmed-down version to save space, but the same missing-articles problem happens whether I use "exclude" or "include" to customize the script.
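
To compare a scheduled run against a manual one, it may help to list which articles failed in each debug log. Below is a minimal Python sketch, assuming the log contains lines of the form `Failed to download article: <title> from <url>` as in the excerpt in the first post. `failed_articles` is a hypothetical helper, and the sample text is abbreviated from that excerpt.

```python
import re

# Matches the failure line format seen in the first post's log excerpt.
FAILED = re.compile(
    r"^Failed to download article: (?P<title>.+) from (?P<url>http\S+)$"
)


def failed_articles(log_text):
    """Return (title, url) pairs for each failed-article line in a debug log."""
    results = []
    for line in log_text.splitlines():
        match = FAILED.match(line.strip())
        if match:
            results.append((match.group("title"), match.group("url")))
    return results


sample = """\
Fetching http://www.nytimes.com/2011/02/14/bu...pagewanted=all
Failed to download article: Word and Lyric, Giffords Labors to Speak Again from http://www.nytimes.com/2011/02/14/us...pagewanted=all
1% Article download failed: u'Word and Lyric, Giffords Labors to Speak Again'
"""

for title, url in failed_articles(sample):
    print(title, "<-", url)
```

Running the same function over the text of a scheduled-run log and a manual-run log would show whether the missing front page articles failed to download or were never queued at all.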

Anyway -- calibre is amazing and the NYT recipe is great; I just wish I could get this one kink ironed out. Any help appreciated!!

thanks
DLG
Old 06-03-2011, 10:04 PM   #6
bcollier
I was able to resolve this problem in the recipe before, but it sounds like it is a problem again. I don't have the paid NYT online subscription, so I can't troubleshoot the problem.
All times are GMT -4.