Hi all,
I'm new here - I had a look around but could not find anything on this problem.
I am working on recipes to scrape WordPress sites, and I am running into a problem where Calibre v0.8.1 changes the HTML structure of the pages.
For example, using this recipe:
https://bitbucket.org/wwmm/schtml/sr...tsefton.recipe
With this command:
ebook-convert ptsefton.recipe .epub --debug-pipeline d --test
The recipe fetches the first page, which has this code in it:
<ul><li><a href="#id2">Immediate future</a></li><li><a href="#id3">The future</a></li></ul>
I know that this code is still intact when postprocess_html returns the HTML, but in the debug output in the parsed directory it has changed to this:
<ul/><li/><a href="#id2">Immediate future</a><li/><a href="#id3">The future</a>
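In case it helps anyone reproduce this, here is a minimal sketch of one way to check the nesting mechanically. It uses only Python's standard html.parser and is not part of the recipe or of Calibre; the class and function names are made up for illustration. For each link, it reports whether the link sits directly inside <ul><li>:

```python
from html.parser import HTMLParser


class ListNestingChecker(HTMLParser):
    """Record, for each <a>, which tags were open when it started."""

    def __init__(self):
        super().__init__()
        self.stack = []           # currently open tags, outermost first
        self.anchor_parents = []  # one tuple of open tags per <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_parents.append(tuple(self.stack))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        # Self-closing tags such as <ul/> arrive as a start immediately
        # followed by an end, so they never stay on the stack.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def links_inside_list_items(html):
    """Return one bool per <a>: True if its two nearest ancestors are ul > li."""
    checker = ListNestingChecker()
    checker.feed(html)
    checker.close()
    return [parents[-2:] == ("ul", "li") for parents in checker.anchor_parents]
```

Run on the two snippets above, it gives [True, True] for the first and [False, False] for the second, so the links really have lost their list parents rather than just being serialised differently.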
Does anyone have any idea why this would be happening?
Thanks,
Peter