Quote:
Originally Posted by DTM
I do exactly the same thing. The attached file will unzip into a Word template file that contains several macros. Use this for the document you want to convert to HTML, then run the macro called Word2HTML. It will clean up the double-paragraph markers and end-of-line paragraph markers you commonly get in text documents, mark word heading1 - heading5 with <h1> - <h5>, replace special characters with escape codes, double-hyphens with em dashes, and more. (If you don't want to do all of the operations--see CAUTION, below--you can run the other individual macros one at a time, if you prefer.)
Now save the document as a text file, add the proper <html>, <body>, etc. tags at the top and bottom, and you'll have something fit for a clean import into Sigil. Hope this helps.
CAUTION: This is quite useful, but not perfect and so is provided as-is, no warranty, use at your own risk, and the other usual disclaimers. It assumes you have a double-paragraph mark between paragraphs, as is common for Gutenberg and other text files. If you actually have just a single paragraph marker at the end of each paragraph, it'll turn the whole document into one huge paragraph. It also clears "unnecessary" white space, so if you have a table or tabs/spaces at the start of a paragraph, or other such formatting, you'll lose it. This is basically intended for documents that are paragraphs of text with chapter headings.
|
DTM:
Is this for Word/Office 2007? My older Word 2003 for XP doesn't recognize the template format. thought I'd try it on a itchy file I have, but it looks like I'll have to run it through the usual hoops--thanks anyway.
Thanks,
Hitch