I'll reiterate that Calibre handles HTML source very well, probably better than any other input format (as OP said, it shows up as ZIP once you add it to Calibre). In fact, my workflow now is to convert whatever source I have to clean HTML, do any cleanup and proofing there, and then feed it to Calibre. Clean HTML is probably the best format for post-OCR cleanup.
By the way, you can drag an entire folder (with an HTML hierarchy: several files, relatively linked images, etc.) onto the Edit metadata -> Formats window, and it gets added as a tidy ZIP file.
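If you'd rather build that ZIP yourself instead of dragging the folder, here is a minimal Python sketch. It only packs the folder with its relative paths intact so the image links keep working; the function name `zip_html_folder` is made up for illustration, and I'm assuming (not confirming) that Calibre treats a hand-built ZIP the same way as the one it creates on drag and drop.

```python
import zipfile
from pathlib import Path


def zip_html_folder(folder, zip_path):
    """Pack every file under `folder` into `zip_path`, keeping relative paths.

    Hypothetical helper: the name and layout are illustrative assumptions,
    not anything Calibre itself provides.
    """
    folder = Path(folder)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in the folder.
            if path.is_file() and path.resolve() != zip_path.resolve():
                # Store paths relative to the folder so links like images/pic.png still resolve.
                zf.write(path, path.relative_to(folder).as_posix())
    return zip_path
```

Calling something like `zip_html_folder("my_book", "my_book.zip")` (both names are placeholders) gives you a single file you can then add to Calibre the usual way.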