07-23-2010, 12:58 PM   #11
elixir
Device: Currently - Sony PRS-600; Sold - Sony PRS-505
Starson17, thanks for reviewing. Yes, there are very few blank pages.

I also ran with the --test option, along with -vv for verbose output. That produced a few parsing errors, which are in the attached bbc-errors.txt file.

One of the errors corresponds to the article UK -> "PM's use of crime figures 'is propaganda'":
Initial parse failed:
Parsing file 'feed_10/article_35/index.html' as HTML
Forcing feed_10/article_35/index.html into XHTML namespace
Forcing index.html into XHTML namespace
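
For anyone who wants to pull the failing articles out of a verbose log without reading it by hand, here is a rough Python sketch. It assumes every failure looks like the excerpt above, that is, an "Initial parse failed:" line followed by a "Parsing file '...' as HTML" line. The function name and the regular expression are my own and are not part of calibre, so adjust them if your log differs.

```python
import re

# Assumed pattern, based only on the log excerpt quoted in this post.
PARSE_LINE = re.compile(r"Parsing file '([^']+)' as HTML")


def failed_articles(log_text):
    """Return the file paths that follow an 'Initial parse failed' line."""
    failed = []
    expect_path = False
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Initial parse failed"):
            # The next "Parsing file ..." line names the article that failed.
            expect_path = True
            continue
        if expect_path:
            match = PARSE_LINE.search(line)
            if match:
                failed.append(match.group(1))
                expect_path = False
    return failed


sample = """Initial parse failed:
Parsing file 'feed_10/article_35/index.html' as HTML
Forcing feed_10/article_35/index.html into XHTML namespace
Forcing index.html into XHTML namespace
"""
print(failed_articles(sample))
```

The feed_N/article_M path can then be matched against the section and article order in the generated book to find the affected article.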


A few more are in the Health section:
Iraq veteran's struggle with PTSD [Thu, 22 Jul 02:51]
Vital care lacking for mini-strokes [Thu, 22 Jul 04:19]
Ex-conjoined twins reunited with mother [Thu, 22 Jul 13:13]
Drugs robots help out hospital staff [Wed, 21 Jul 13:11]

I've uploaded my entire archive to Dropbox here:

http://dl.dropbox.com/u/1925993/calibre/bbc.zip
Attached Files
File Type: txt bbc-errors.txt (2.3 KB, 379 views)

Last edited by elixir; 07-23-2010 at 01:24 PM. Reason: added more description