Starson17, thanks for reviewing. Yes, there are very few blank pages.
I also ran with the --test option, together with -vv for verbose output. That produced a few parsing errors, which are in the attached bbc-errors.txt file.
One of the errors corresponds to this article:
UK->"PM's use of crime figures 'is propaganda'"
Initial parse failed:
Parsing file 'feed_10/article_35/index.html' as HTML
Forcing feed_10/article_35/index.html into XHTML namespace
Forcing index.html into XHTML namespace
A few more are in the Health section:
Iraq veteran's struggle with PTSD [Thu, 22 Jul 02:51]
Vital care lacking for mini-strokes [Thu, 22 Jul 04:19]
Ex-conjoined twins reunited with mother [Thu, 22 Jul 13:13]
Drugs robots help out hospital staff [Wed, 21 Jul 13:11]
I've uploaded my entire archive to Dropbox here:
http://dl.dropbox.com/u/1925993/calibre/bbc.zip
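
In case it helps anyone going through bbc-errors.txt, here is a rough Python sketch that pulls out the file paths following each "Initial parse failed:" line. It assumes the whole log follows the same pattern as the excerpt quoted earlier in this post, which may not hold everywhere. The function name failed_parse_paths is made up for illustration and is not part of calibre.

```python
import re

# Assumption: each failure is logged as an "Initial parse failed:" line,
# later followed by a "Parsing file '<path>' as HTML" line, as in the excerpt.
PARSE_LINE = re.compile(r"Parsing file '([^']+)' as HTML")


def failed_parse_paths(log_text):
    """Return the paths that follow an 'Initial parse failed:' line.

    Paths are returned in order of first appearance, without duplicates.
    """
    paths = []
    pending = False  # True once a failure line was seen and no path matched yet
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Initial parse failed:"):
            pending = True
            continue
        if pending:
            match = PARSE_LINE.search(stripped)
            if match:
                if match.group(1) not in paths:
                    paths.append(match.group(1))
                pending = False
    return paths


if __name__ == "__main__":
    sample = (
        "Initial parse failed:\n"
        "Parsing file 'feed_10/article_35/index.html' as HTML\n"
        "Forcing feed_10/article_35/index.html into XHTML namespace\n"
    )
    print(failed_parse_paths(sample))
```

To run it on the real log, read bbc-errors.txt into a string and pass that to failed_parse_paths instead of the sample.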