Quote:
Originally Posted by KNickerson
(Disclaimer: This is the first time I've looked into the recipes.)
I'm thinking it's a download problem. I copied the script off and ran
ebook-convert PolitifactKJN.recipe .epub -vv --debug-pipeline debug
Then I found a bad section, and hunted it down in debug\input.
The index.html there is just garbage. Isn't that the raw stuff downloaded before the recipe kicks in? If not, how do I get to the raw stuff?
That should be the raw stuff, but to be sure you can do this:
Code:
def preprocess_html(self, soup):
    # Print the soup this method receives, then pass it through unchanged
    print('The raw stuff is: %s' % soup)
    return soup
Is it always the same crud at the same point?
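If you want to check that without eyeballing the files, here is a rough sketch. It assumes the page is supposed to be UTF-8 and that the crud shows up as bytes that won't decode; neither is confirmed for this feed. The helper names first_bad_offset and same_crud_spot are made up for illustration and are not part of calibre.

```python
def first_bad_offset(data, encoding="utf-8"):
    """Return the byte offset of the first undecodable byte, or -1 if all of it decodes."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte the codec could not decode
        return err.start
    return -1


def same_crud_spot(runs, encoding="utf-8"):
    """True if every run contains crud and it starts at the same byte offset."""
    offsets = [first_bad_offset(raw, encoding) for raw in runs]
    # No runs, or a clean first run, means there is nothing to compare
    return bool(offsets) and offsets[0] != -1 and len(set(offsets)) == 1
```

To use it, read the index.html from two separate debug runs in binary mode (open(path, 'rb').read()) and pass both byte strings to same_crud_spot in a list. If the offsets match, the problem is probably in the page itself; if they move around, a flaky download looks more likely.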