07-21-2010, 03:28 PM | #1 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
BBC News feeds in Calibre
Hi all, I am a new user of Calibre and have been using it to populate news feeds on my PRS-505.
Most of my feeds have been working fine. However, I noticed that the BBC feeds show just the news headlines; when you click on an article, it is blank, with just the URL at the bottom. This is the case both when viewing in Calibre and on my e-Reader. I'm running Calibre 0.7.9 and have tried both the "The BBC" and "BBC News (Fast)" feeds that come with Calibre. Is anyone else experiencing this as well, or is it just a configuration issue on my end? Thanks! |
07-21-2010, 04:36 PM | #2 |
Grand Sorcerer
Posts: 11,939
Karma: 7219261
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
The BBC has recently seen fit to destroy all their news sites. I imagine that the recipe has not yet been adapted to the new 'improved' version.
|
07-21-2010, 05:42 PM | #3 |
creator of calibre
Posts: 44,337
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You should open a ticket assigned to darkom, who is the author of the BBC recipes. He'll take a look at them when he has the time.
|
07-21-2010, 05:48 PM | #4 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
Chaley, you are right, the BBC does have a 'new look'!
Kovid, will open a ticket, thx. |
07-22-2010, 10:21 AM | #5 |
Addict
Posts: 234
Karma: 6720
Join Date: Aug 2008
Device: SONY PRS505
|
wow, it's been happening for several weeks already, and only now someone notices?
wonder how many people use this recipe |
07-22-2010, 10:26 AM | #6 |
creator of calibre
Posts: 44,337
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You can see recipe usage stats here: http://status.calibre-ebook.com/recipe_stats
|
07-22-2010, 02:57 PM | #7 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Here are the two recipes, updated. If someone who used the old version will test them, they can confirm they look correct and pass them to Kovid or post here with problems. This is quick and dirty, but seems OK in my quick tests:
Without images: Spoiler:
With images: Spoiler:
Last edited by Starson17; 07-22-2010 at 04:34 PM. |
07-22-2010, 06:17 PM | #8 |
Member
Posts: 15
Karma: 10
Join Date: Mar 2010
Device: DR800SG
|
BBC (fast)
I just tested the "without images" version (calibre 0.7.9, Win XP SP3).
It's a definite improvement, but there are still lots of articles in the table of contents that prove to be blank. |
07-22-2010, 07:13 PM | #9 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
Starson17
I tried the "with images" version, same result as JRG, still a lot of blank articles but certainly better than the older version. |
07-23-2010, 07:48 AM | #10 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Edit: I just ran the complete version, and looked at 20 articles in various feeds. All had content. I couldn't find any that didn't work. I don't dispute that they're there, but I couldn't find them. Post a link to an empty article and I'll find out how it differs from the working articles. Last edited by Starson17; 07-23-2010 at 09:08 AM. |
|
07-23-2010, 12:58 PM | #11 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
Starson17, thanks for reviewing and yes there are very few blank pages.
I also ran with the --test option, and added the -vv option for verbose output. That resulted in a few parsing errors; they are in the attached bbc-errors.txt file. One of the errors corresponds to the article UK->"PM's use of crime figures 'is propaganda'":
Initial parse failed:
Parsing file 'feed_10/article_35/index.html' as HTML
Forcing feed_10/article_35/index.html into XHTML namespace
Forcing index.html into XHTML namespace
A few more are in the Health section:
Iraq veteran's struggle with PTSD [Thu, 22 Jul 02:51]
Vital care lacking for mini-strokes [Thu, 22 Jul 04:19]
Ex-conjoined twins reunited with mother [Thu, 22 Jul 13:13]
Drugs robots help out hospital staff [Wed, 21 Jul 13:11]
I've uploaded my entire archive to Dropbox here: http://dl.dropbox.com/u/1925993/calibre/bbc.zip
Last edited by elixir; 07-23-2010 at 01:24 PM. Reason: added more description |
07-23-2010, 02:15 PM | #12 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
I'm running some tests and will see if I can work around the parsing problem. Somewhere I read that newer versions of Beautiful Soup use an HTML parser that is less tolerant of malformed code than older versions. I don't know if that plays a role. Edit: The links to several failed articles end in .stm. If you can check whether they all end that way, at least we can focus on how those articles differ. Last edited by Starson17; 07-23-2010 at 04:34 PM. |
||
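For anyone wanting to do the .stm check without eyeballing every link, something like the following might help. It is only a rough sketch: the function name `group_by_suffix` and the sample URLs are made up for illustration and are not taken from the recipe or from the actual failing articles.

```python
from collections import defaultdict
from posixpath import splitext
from urllib.parse import urlparse


def group_by_suffix(urls):
    """Group article URLs by the file extension of their path."""
    groups = defaultdict(list)
    for url in urls:
        # Look only at the path, so query strings do not hide the extension.
        path = urlparse(url).path
        suffix = splitext(path)[1].lower() or "(none)"
        groups[suffix].append(url)
    return dict(groups)


# Hypothetical URLs, for illustration only.
failed = [
    "http://news.example.com/1/hi/uk/0000001.stm",
    "http://news.example.com/news/health-0000002",
]
for suffix, items in group_by_suffix(failed).items():
    print(suffix, len(items))
```

Paste the links of the blank articles into `failed`; if everything lands under `.stm`, the page type is the likely common factor.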
07-23-2010, 02:56 PM | #13 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
You are onto something; yes, that particular one in the 'UK' section ends in .stm.
I also looked at all the failing ones in 'Health' and it seems all of these articles have an embedded video, so I guess it's probably just a matter of putting in something to ignore those tags? |
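To illustrate what "ignoring those tags" could mean, here is a standalone sketch using only Python's standard library. It is not the recipe fix itself (calibre recipes have their own `remove_tags` setting for this kind of filtering), and it assumes the videos are embedded with `object` or `embed` tags, which has not been confirmed for the BBC pages. The names `VideoStripper` and `strip_video` are invented for this example. HTML comments and doctype declarations are dropped for brevity.

```python
from html.parser import HTMLParser


class VideoStripper(HTMLParser):
    """Re-emit HTML while dropping <object> elements and <embed> tags."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # how many <object> elements we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "object":
            self.depth += 1
        elif tag != "embed" and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/>: keep unless they are video tags.
        if tag not in ("object", "embed") and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "object":
            self.depth = max(0, self.depth - 1)
        elif tag != "embed" and self.depth == 0:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append("&#%s;" % name)


def strip_video(html):
    parser = VideoStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `strip_video()` on an article's HTML returns the same markup with the video elements and everything nested inside them removed.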
07-23-2010, 03:30 PM | #14 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Without images: Spoiler:
With images: Spoiler:
|
|
07-23-2010, 03:35 PM | #15 |
Member
Posts: 10
Karma: 10
Join Date: Jul 2010
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
|
Great - I will try these out and get back to you if I see any errors, and yes will include the news URL
Thanks for your help! Edit: I just reviewed two full sections in both versions and no problems so far!! Last edited by elixir; 07-23-2010 at 03:51 PM. Reason: Testing update |
|