Easiest alternative
Well, it might be easiest to change my Word defaults to strip out the formatting as I paste text in. Since I already verify all italics as I go, I might as well redo them and see how that affects the conversion results.
I hesitate at the Ctrl-Q and Ctrl-Space idea, since I don't know what those functions do. But I'll take a look and see.
As far as the idea of a direct LIT conversion goes, I don't know for sure what type of file these docx files came from. I had already done a cut/paste from the original to make the docx file I converted. And I get my eBooks in all types of files, including PDFs, which I always remake as docx before I try them in Calibre.
I think as a test case, I will do a couple chapters of a fresh book (to see which way it goes from the formatting as is), then redo the same text, with Word stripping the formatting and my redoing it myself.
I'll let you all know my results.
Oh, and another reason I want this to be consistent: I have ~absolutely no~ idea what the code line you got in the doc XML means. And I don't know anything about regex either. Besides, simple is better, and removing the problem will make redoing things faster.