Actually, Calibre does remarkably well at parsing the HTML from the MediaWiki API, with the sole exception of the HTML for this "Contents" infobox, so I don't really see any need for additional changes to the HTML. Since no further explanation of how the HTML input plugin works or what it expects seems to be available, I now simply strip the problematic chunk of HTML out of the original document before sending it to Calibre, which handles the rest of the document very nicely!
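
For anyone wanting to do the same, here is a minimal sketch of that kind of pre-processing step, using only Python's standard `html.parser`. It is not my exact code. The names `ElementStripper` and `strip_element` are made up for illustration, and it assumes the "Contents" box is a single element with `id="toc"` and that the markup inside it is properly nested, so check both against what your API call actually returns.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementStripper(HTMLParser):
    """Re-emit a document while dropping one element (and its children) by id."""

    def __init__(self, element_id):
        # convert_charrefs=False keeps entities such as &amp; intact on output.
        super().__init__(convert_charrefs=False)
        self.element_id = element_id
        self.out = []
        self.skip_depth = 0  # > 0 while inside the element being removed

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        # ASSUMPTION: the unwanted block is identified by its id attribute.
        if dict(attrs).get("id") == self.element_id and tag not in VOID_TAGS:
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> never change the nesting depth.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        if not self.skip_depth:
            self.out.append(f"<!{decl}>")


def strip_element(html, element_id="toc"):
    """Return html with the element carrying element_id removed."""
    parser = ElementStripper(element_id)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

Calling `strip_element(page_html)` on the HTML string from the API and handing the result to Calibre is the whole idea. I would go with a parser over a regular expression here because the box contains nested `<div>` elements, which a non-greedy regex would cut off at the first closing `</div>`.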