03-17-2013, 10:52 AM | #1 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Bug in web/feeds/__init__ processing feed article date
There is a problem in parsing RSS feeds where the article dates are not being processed properly. In the routine parse_article the article date is being extracted as item.get('date_parsed', time.gmtime()) which is returning the current time because the feed/article key 'date_parsed' is not found.
In two major RSS feeds (NYTimes and Globe and Mail) the feed/article date key is 'published_parsed'. I realize there are variants of RSS feed formats, but I believe (based on looking at feedparser.py) that 'published_parsed' is a valid, non-deprecated key. As a result, any RSS feed-based recipe whose feeds use the 'published_parsed' key is not obeying the oldest_article restriction. I'm not sure what the best fix is. Probably it is to extract using 'published_parsed' if the 'date_parsed' key isn't there (or vice versa), to avoid breaking feeds that use 'date_parsed'. |
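To make the suggested fallback concrete, here is a minimal sketch of the idea, not calibre's actual code. The helper name article_date, the DATE_KEYS tuple, and the inclusion of 'updated_parsed' are assumptions for illustration; the item is treated as any dict-like object with a .get method, as a feedparser entry is.

```python
import time

# Hypothetical key order: keep 'date_parsed' first so feeds that already
# work are unaffected, then try the other parsed-date keys.
DATE_KEYS = ('date_parsed', 'published_parsed', 'updated_parsed')


def article_date(item, now=None):
    """Return the first parsed date found on the item.

    Falls back to the current UTC time only when none of the keys
    is present, which is what the single-key lookup does today.
    """
    for key in DATE_KEYS:
        value = item.get(key)
        if value is not None:
            return value
    return now if now is not None else time.gmtime()
```

With a lookup like this, an entry that only carries 'published_parsed' would keep its real date instead of being stamped with the time of the download.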
03-17-2013, 06:51 PM | #2 |
Enthusiast
Posts: 32
Karma: 10
Join Date: Apr 2012
Device: Amazon Kindle Paperwhite
|
Could this be the same issue that is causing the oldest article function not to skip old articles in basic recipes? The issue started with Calibre 0.9.23.
For example, I have a basic recipe that pulls the RSS feed for betanews.com. To recreate the issue, create a basic recipe and enter these fields:

    class BasicUserRecipe1363558652(AutomaticNewsRecipe):
        title = u'Beta News'
        oldest_article = 1
        max_articles_per_feed = 100
        auto_cleanup = True
        feeds = [(u'Top Stories', u'http://feeds.betanews.com/bn')]

Using Calibre 0.9.21, this basic recipe correctly returns only articles posted in the last day. Using Calibre 0.9.23, it returns every article in the http://feeds.betanews.com/bn feed; the oldest is currently 4 days old. The same issue is happening with my other basic recipes. All of them ran correctly until I installed 0.9.23. |
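As a rough illustration of why a wrong article date would defeat oldest_article, here is a sketch of an age check. This is an assumption about the general shape of such a filter, not calibre's implementation; the function name is_too_old and its parameters are hypothetical.

```python
import calendar
import time

SECONDS_PER_DAY = 86400


def is_too_old(published, oldest_article_days, now=None):
    """Return True if a UTC struct_time is older than the cutoff in days.

    'now' is seconds since the epoch; it defaults to the current time.
    """
    now = time.time() if now is None else now
    age_seconds = now - calendar.timegm(published)
    return age_seconds > oldest_article_days * SECONDS_PER_DAY


# A 4-day-old article is skipped when oldest_article = 1 ...
now = 10 * SECONDS_PER_DAY
four_days_old = time.gmtime(6 * SECONDS_PER_DAY)
skipped = is_too_old(four_days_old, 1, now=now)

# ... but if its date was replaced by the current time, it never is.
stamped_now = time.gmtime(now)
kept = not is_too_old(stamped_now, 1, now=now)
```

If every article ends up dated "now" because the date key was not found, a check of this kind can never skip anything, which would match the behaviour described above.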
03-17-2013, 06:59 PM | #3 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
^I don't know when this problem arose but yes, quite likely this is why.
|
03-18-2013, 01:48 AM | #5 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|