01-26-2010, 03:56 PM | #1 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Washington Post news feed crash
The Washington Post RSS feeds contain empty articles in the index, which cause the parser to crash. I've opened a ticket; in the meantime, adding the following code to the recipe works around the issue.
Code:
def preprocess_html(self, soup):
    for tag in soup.findAll('font'):
        if tag['size']:
            if tag['size'] == '+2':
                if tag.b:
                    return soup
    return None
|
01-29-2010, 06:33 PM | #2 |
onlinenewsreader.net
|
Kovid, thanks for the response to the ticket. Since it is an XHTML defect in the Washington Post feed, perhaps you should put the workaround in the standard recipe (I've tested it and it works).
|
01-29-2010, 06:40 PM | #3 |
onlinenewsreader.net
|
Actually, the following is safer, since the test tag['size'] raises a KeyError when the tag has no 'size' attribute.
Code:
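def preprocess_html(self, soup):
    # Return the soup only if a <font size="+2"> tag contains a <b> tag.
    for tag in soup.findAll('font'):
        # has_key() guards against <font> tags without a 'size' attribute.
        if tag.has_key('size'):
            if tag['size'] == '+2':
                if tag.b:
                    return soup
    # No matching tag was found.
    return None
|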
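The check in the recipe above can be tried outside calibre with a rough sketch that uses only the Python standard library. This is an illustration under stated assumptions, not calibre's or BeautifulSoup's API: `HeadlineDetector` and `has_headline` are hypothetical names, and the sketch assumes, as the recipe does, that a non-empty article carries a `<b>` tag inside `<font size="+2">`.

```python
from html.parser import HTMLParser


class HeadlineDetector(HTMLParser):
    """Hypothetical helper: detects a <b> nested inside <font size="+2">."""

    def __init__(self):
        super().__init__()
        self.font_stack = []  # one bool per open <font>: is it size="+2"?
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == 'font':
            # dict.get() returns None when 'size' is absent, so a <font>
            # tag without that attribute cannot raise a KeyError here.
            self.font_stack.append(dict(attrs).get('size') == '+2')
        elif tag == 'b' and any(self.font_stack):
            self.found = True

    def handle_endtag(self, tag):
        if tag == 'font' and self.font_stack:
            self.font_stack.pop()


def has_headline(html):
    """Return True if the markup contains <b> inside <font size="+2">."""
    detector = HeadlineDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

For example, `has_headline('<font size="+2"><b>Title</b></font>')` returns `True`, while a `<font>` tag with no `size` attribute is simply skipped rather than causing an error.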
01-29-2010, 08:25 PM | #4 |
creator of calibre
Posts: 43,859
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
done.
|