Old 01-09-2014, 01:39 PM   #18
BetterRed
null operator (he/him)
Posts: 21,800
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
I think what you get in the ePUB XHTML depends on what you do in Word with cut & paste and editing. In the shot I posted, I changed the ". Not" to "… not" in Word and then did a conversion.

The fragmentation of the "not" that you see in the ePUB XHTML reflects what's in the Word DOCX XML:

ePUB XHTML
Code:
<i class="calibre1">n</i><span class="text1">ot</span>
DOCX XML
Code:
<w:r w:rsidR="00AB4F90" w:rsidRPr="00160E46">
   <w:t>n</w:t>
</w:r>
<w:r w:rsidRPr="00160E46">
   <w:t>ot.</w:t>
</w:r>
So my conclusion is that the <span class="text1">blah blah</span> sequences stem directly from the XML that Word creates in its DOCX files, and that the more editing one does on the DOCX, the more disorderly the XML becomes. After conversion that results in less than optimal XHTML, i.e. Garbage In, Garbage Out.
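
As a rough illustration of that point, here is a minimal Python sketch of how adjacent runs with identical formatting could be detected and their text joined. It assumes the fragment above sits inside a w:p paragraph element and that run formatting lives in a w:rPr child (both standard WordprocessingML). The function names run_format and merge_runs are made up for this example, and this is not a description of what calibre does internally.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by DOCX document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# The two runs from the post, wrapped in a paragraph (the wrapper is an assumption).
SAMPLE = """
<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r w:rsidR="00AB4F90" w:rsidRPr="00160E46">
    <w:t>n</w:t>
  </w:r>
  <w:r w:rsidRPr="00160E46">
    <w:t>ot.</w:t>
  </w:r>
</w:p>
"""


def run_format(run):
    """Hypothetical helper: a comparable key for a run's formatting.

    Only the direct children of w:rPr are compared. The rsid attributes
    live on w:r itself, so they are ignored automatically.
    """
    rpr = run.find("w:rPr", NS)
    if rpr is None:
        return ()
    return tuple((child.tag, tuple(sorted(child.attrib.items()))) for child in rpr)


def merge_runs(paragraph):
    """Hypothetical helper: join the text of adjacent runs with equal formatting.

    Returns a list of (format_key, text) pairs; the XML itself is not modified.
    """
    merged = []
    for run in paragraph.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        key = run_format(run)
        if merged and merged[-1][0] == key:
            # Same formatting as the previous run: extend its text.
            merged[-1] = (key, merged[-1][1] + text)
        else:
            merged.append((key, text))
    return merged


if __name__ == "__main__":
    print(merge_runs(ET.fromstring(SAMPLE)))
```

With the fragment exactly as posted, neither run carries a w:rPr, so the two runs collapse into a single piece of text. In a real DOCX where the "n" is italic, the keys would differ and the split would remain, which is the fragmentation that shows up after conversion.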

One way of ensuring better consistency might be to paste plain text into the DOCX; you can set this via Word Options->Advanced->Cut, copy, and paste. You'd then have to redo all the font styling manually.

If the examples you posted originate from LIT it might be interesting to see the XHTML that a LIT to EPUB conversion creates.

BR

Last edited by BetterRed; 01-09-2014 at 04:08 PM. Reason: did a Tidy on the XML fragment