Quote:
Originally Posted by Sablerose
Well, it might be easiest to change my Word defaults to strip out the formatting as I paste text in. Since I verify all italics as I go already, I might as well redo them, and see how that affects the conversion results.
The ctrl-q and ctrl-space idea, I hesitate at, since I don't know what those functions do. But I'll take a look and see.
@Sablerose - 300+ Useful Word 2007 KB Shortcuts. That site also has shortcuts for Word 2010, and for many other programs.
Quote:
Originally Posted by Sablerose
Oh, and another reason I want this to be consistent. I have ~absolutely no~ idea what the code line you got in the doc XML means. And I don't know anything about regex either. Besides, simple is better and removing the problem will make redoing things faster.
@Sablerose - I only posted the XML from the DOCX to demonstrate that the weird XHTML in the ePUB is a direct result of converting the similarly weird XML in the DOCX. The 'weird' is that 'not' is split into 'n' and 'ot'.
I updated the XML fragment I posted earlier - after a Tidy
The 'n' and 'ot' are at the beginning and end of both the XHTML and the XML. I don't expect anyone to actually comprehend the DOCX XML - except maybe Kovid.
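For anyone curious what "split" means in practice, here is a minimal Python sketch. The XML fragment in it is made up for illustration, it is not the XML from the actual document, and the sentence and the helper names `runs` and `merge_runs` are my own assumptions. It only shows the general idea: Word stores text in "runs", one italic word can end up in two runs ('n' and 'ot'), and a converter that turns each run into its own tag will reproduce the split.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml in a DOCX.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph, NOT the real fragment from the thread:
# the italic word "not" is stored as two separate runs, "n" and "ot".
FRAGMENT = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">I did </w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>n</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>ot</w:t></w:r>
  <w:r><w:t xml:space="preserve"> say that.</w:t></w:r>
</w:p>
""".strip()


def runs(paragraph_xml):
    """Return a list of (text, is_italic) pairs, one per run."""
    paragraph = ET.fromstring(paragraph_xml)
    result = []
    for run in paragraph.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
        # Simplification: any <w:i> element counts as italic here;
        # a real document can also switch italics off with w:val.
        italic = run.find(f"{{{W}}}rPr/{{{W}}}i") is not None
        result.append((text, italic))
    return result


def merge_runs(run_list):
    """Join neighbouring runs that have the same italic flag."""
    merged = []
    for text, italic in run_list:
        if merged and merged[-1][1] == italic:
            merged[-1] = (merged[-1][0] + text, italic)
        else:
            merged.append((text, italic))
    return merged


if __name__ == "__main__":
    print(runs(FRAGMENT))              # one entry per run, split as stored
    print(merge_runs(runs(FRAGMENT)))  # neighbouring italic runs joined
```

Stripping the formatting and re-applying the italics in one go, as you plan to do, should have much the same effect as the merge step: the word ends up in a single run, so there is nothing split left to convert.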
BR