Wall Street Journal--feedparser error?
Thought I'd start a new thread for this, since it appears to be a different issue from the one in the previous WSJ thread.
I've been getting errors on WSJ since Saturday. parse_index works correctly, but when the articles themselves are parsed, each one returns the error "Initial parse failed, using more forgiving parsers", and the resulting epub contains only empty articles.
A quick search suggests that this error message originates with feedparser. I'm guessing the solution is to alter the downloaded HTML in some way so that it conforms to what feedparser expects, but I'm not sure how to do this.
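To make the question concrete, here is a rough sketch of the kind of cleanup I have in mind. It is only a guess: I don't know what is actually tripping up the parser, so the patterns below (script blocks, HTML comments, stray control characters) are assumptions, and clean_raw_html is a hypothetical helper name. The only calibre piece I'm relying on is the preprocess_raw_html(self, raw_html, url) hook on BasicNewsRecipe, shown in the trailing comment.

```python
import re

# Assumed culprits only; the real cause of the parse failure is unknown.
_SCRIPT_RE = re.compile(r'<script\b.*?</script>', re.IGNORECASE | re.DOTALL)
_COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)
_CONTROL_RE = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def clean_raw_html(raw_html):
    """Strip script blocks, HTML comments, and control characters.

    Hypothetical helper: takes the raw page source as a str and
    returns a cleaned str.
    """
    cleaned = _SCRIPT_RE.sub('', raw_html)
    cleaned = _COMMENT_RE.sub('', cleaned)
    cleaned = _CONTROL_RE.sub('', cleaned)
    return cleaned


# Inside the recipe, the helper could be hooked in roughly like this
# (not runnable outside calibre, so left as a comment):
#
# class WSJ(BasicNewsRecipe):
#     def preprocess_raw_html(self, raw_html, url):
#         return clean_raw_html(raw_html)
```

If something along these lines is the right approach, I'd still need to know which part of the page is actually causing the failure.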
Logfile attached. Any advice would be greatly appreciated.
Thanks,
Dale