Quote:
Originally Posted by Hitch
I've not found any insta-conversion program that works worth a crap. Some of the guys here on MR have stated that using the Calibre EDITOR (not the converter) works really well. This is using the Editor as an import tool--so, you'd export your HTML to your HTML editor, clean it up a bit, and then paste it into the Calibre editor, and pop out an ePUB. Of course, you could also use the Calibre editor as an HTML editor, too, to simplify it even further.
Well, importing an HTML file into calibre's Editor is not really different from importing an HTML file into Sigil. Mostly because the file is imported as-is in either case.
But calibre includes a DOCX Input plugin* for the conversion backend, which is also used to open a DOCX directly in the Editor.
Unlike conversion, importing a DOCX into the Editor does not go through stage two of the conversion pipeline -- the CSS flattening.
Conversion:
1) Input plugin, DOCX ==> OEB
2) CSS flattening and other monstrous nightmares, OEB ==> OEB
3) Output plugin, OEB ==> EPUB
Editor:
1) Input plugin, DOCX ==> OEB
2) OEB ==> EPUB
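To make the difference concrete, here is a toy model of the two routes listed above. It is only a sketch: the function names are labels I made up for the stages, not calibre's internal API, and nothing here touches a real file.

```python
# Toy model of the two routes. Each "stage" just appends a label to a trace,
# so the traces show which steps a book passes through.
# All names are illustrative, NOT calibre's real internals.

def docx_input(trace):
    # Stage shared by both routes: the DOCX Input plugin.
    return trace + ["input plugin: DOCX ==> OEB"]

def css_flattening(trace):
    # Conversion-only stage.
    return trace + ["CSS flattening: OEB ==> OEB"]

def epub_output(trace):
    # Conversion route: the EPUB output plugin.
    return trace + ["output plugin: OEB ==> EPUB"]

def editor_write_epub(trace):
    # Editor route: the imported book is written out as an EPUB.
    return trace + ["OEB ==> EPUB"]

CONVERSION = [docx_input, css_flattening, epub_output]
EDITOR = [docx_input, editor_write_epub]

def run(stages):
    """Run the stages in order and return the list of steps taken."""
    trace = []
    for stage in stages:
        trace = stage(trace)
    return trace

conversion_trace = run(CONVERSION)
editor_trace = run(EDITOR)

for name, trace in (("Conversion", conversion_trace), ("Editor", editor_trace)):
    print(name + ":")
    for number, step in enumerate(trace, start=1):
        print("  %d) %s" % (number, step))
```

Both routes start with the same input plugin; the flattening stage only ever shows up in the conversion trace.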
See calibre's manual:
Introduction to conversion
...
...
Given that the Editor now understands DOCX properly, I can only regard converting DOCX ==> EPUB in order to get an EPUB for editing as nothing short of medium insanity.
...
...
So, "Import an (HTML or) DOCX file as a new book" is not substantially different from using Toxaris' Word addin to export EPUB; obviously, the Word addin uses a different method for translating DOCX into an EPUB, but the same general goal is present in both. I have no idea which route you might consider to have preferable output -- maybe they're both good/bad in different ways.
There's a fun experiment for you.

Compare a series of books, exported with Toxaris' tools (just a straight export, none of the other fancy pre-processing options) vs. imported directly to the calibre Editor.
Then tell us all, in your expert, professional opinion, which one does a better job of preserving styles and producing as clean an EPUB as you can actually get from Word.
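If you want some rough numbers to go with the eyeball comparison, something like the sketch below could help. It relies only on the fact that an EPUB is a ZIP container, and the metrics (stylesheet size, a crude count of CSS rules, inline style attributes, span tags) are my own guess at proxies for "clean"; they are no substitute for actually reading the markup. The file names in the usage note are hypothetical.

```python
import re
import zipfile

def summarize_epub(path_or_file):
    """Return crude clutter metrics for one EPUB (path or file-like object)."""
    css_bytes = 0
    css_text = []
    html_text = []
    with zipfile.ZipFile(path_or_file) as zf:
        names = zf.namelist()
        for name in names:
            lower = name.lower()
            if lower.endswith(".css"):
                raw = zf.read(name)
                css_bytes += len(raw)
                css_text.append(raw.decode("utf-8", errors="replace"))
            elif lower.endswith((".xhtml", ".html", ".htm")):
                html_text.append(zf.read(name).decode("utf-8", errors="replace"))
    css = "\n".join(css_text)
    html = "\n".join(html_text)
    return {
        "files": len(names),
        "css_bytes": css_bytes,
        # Counting "{" is a rough stand-in for the number of rules;
        # nested blocks such as @media inflate it slightly.
        "css_rules": css.count("{"),
        "inline_styles": len(re.findall(r"\bstyle\s*=", html)),
        "spans": len(re.findall(r"<span\b", html)),
    }

def compare(first, second):
    """Pair up the metrics of two EPUBs as {metric: (first, second)}."""
    a = summarize_epub(first)
    b = summarize_epub(second)
    return {key: (a[key], b[key]) for key in a}
```

For example, `compare("book-word-addin.epub", "book-calibre-editor.epub")` would give one pair of numbers per metric for the same source document exported both ways.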
EDIT:
* -- The DOCX Input plugin is one of the builtin plugins. Sorry, technically most things in calibre are expressed as plugins; that's the reason why people can write new, external plugins for new formats, e.g. the KEPUB plugins.