Quote:
Originally Posted by elixir
I also ran with the --test option, and I added the -vv option for verbose output.
|
Yes, I also always run with -vv.
Quote:
That resulted in a few parsing errors; they are in the attached bbc-errors.txt file.
One of the errors corresponds to the article UK->"PM's use of crime figures 'is propaganda'"
|
I checked this article, and mine was also empty, with a parsing error.
I'm running some tests and will see if I can work around the parsing problem. Somewhere I read that newer versions of Beautiful Soup use an HTML parser that is less tolerant of malformed markup than older versions were. I don't know whether that plays a role here.
Edit: The links to several failed articles end in .stm. If you can check whether they all end that way, we can at least focus on how those articles differ.
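If it helps with that check, here is a rough sketch that tallies a list of failed links by file extension. The function name count_extensions and the example.com URLs are placeholders of mine, not real failures from the log; paste in the actual links from bbc-errors.txt.

```python
from collections import Counter
from posixpath import splitext
from urllib.parse import urlparse


def count_extensions(urls):
    """Count URLs by the extension of their path ('' when there is none)."""
    counts = Counter()
    for url in urls:
        # Look only at the path, so query strings and fragments are ignored.
        path = urlparse(url.strip()).path
        counts[splitext(path)[1].lower()] += 1
    return counts


# Hypothetical placeholder links; replace with the failed article URLs.
failed_urls = [
    "http://example.com/news/uk/1234567.stm",
    "http://example.com/news/world/7654321.stm",
    "http://example.com/news/uk-politics-11111111",
]

counts = count_extensions(failed_urls)
for ext, n in counts.most_common():
    print(f"{ext or '(no extension)'}: {n}")
```

If every failure lands in the .stm bucket and none of the successful articles do, that would point at the page template rather than at individual articles.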