Quote:
Originally Posted by Hitch
Kim, I have mad respect for your skills, but we do this ALL DAY LONG, and we don't get those spans, in exporting to HTML/XML/etc.. You didn't get them, unless I've misread your post, in the third image--where you exported directly to HTML. Right?
|
Wrong. I get those spans in HTML (unfiltered flavor only!) and in ODT. Study the third image a little more closely.

The spans have a different, more detailed styling, but they're there all the same.
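If someone just wants those spans gone from the exported HTML, here is a minimal sketch in Python (standard library only). It is an illustration, not the method anyone in this thread actually uses: `SpanStripper` and `strip_spans` are names I made up, the sample markup in the usage note is hypothetical, and the code assumes that every <span> in the file is junk. It also drops comments and the doctype, so it is meant for body fragments rather than whole documents.

```python
from html.parser import HTMLParser


class SpanStripper(HTMLParser):
    """Re-emit HTML as parsed, but leave out every <span> wrapper."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            # get_starttag_text() returns the tag exactly as written in the source.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>.
        if tag != "span":
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "span":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_spans(html):
    """Return html with all <span> tags removed and their contents kept."""
    parser = SpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `strip_spans('<p>Some <span style="font-size:12pt">junk</span> text</p>')` gives back the paragraph with the span unwrapped. If a file also uses spans for real formatting (italics, small caps and so on), this would remove those too, so a filter on class or style would be needed first.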
Quote:
Originally Posted by Hitch
Are you saying that the exports to .odt format can't avoid the spans?
|
Apparently not; at least, not in the experiments I have made trying to solve this specific issue (which, I must admit, number neither in the hundreds nor in the thousands). I usually don't make epubs from either Word or LibreOffice, but both programs have some useful add-ins for cleaning up OCR before I process the text.
Quote:
Originally Posted by Hitch
My take on this is that these are effectively junk spans, created because the paragraphs are unstyled. Period. If the file is made correctly, using normal named styles, these should not appear
|
The paragraphs are styled with the template's "Normal" style. And yes, it's junk, but Microsoft must have had some reason to do it that way (hopefully).
Quote:
Originally Posted by Hitch
What happens if these files are styled as they should be, before these save-as and export experiments? Rather than just using "Normal?"
|
Why should "Normal" not be correct? I mean, "Normal" may not be the best choice when we talk about conversion to epub using whatever tool one likes best, but after all, 90% or more of the text I see in the documents coming my way (my alter ego works in the Danish union for "white collar workers", which is where I use the Office programs professionally) is styled as "Normal". When I "translate" Word documents from our journalist to HTML for our website (using Pandoc), "Normal" translates to <p>, which is as it should be, in my opinion anyway.
Quote:
Originally Posted by Hitch
Here's my stupid question, then: why is the .odt file being subsequently exported to .odt, anyway? Hasn't the process been described as .odt-->Macro-->Word-->HTML?
|
Dunno. It's Roger64's way of doing things. You do it your way, I do it mine, and so on. We have a saying in Danish (I guess it exists in English too, I just can't find the expression right now): "Man's will, man's heaven". Each to his own.
Quote:
Originally Posted by Hitch
The only time we use the default para (class) is, as Barnhill notes, when we're swapping out direct styling for spans and vice-versa, because it's useful in Word searches, if those are needing to be done there, instead of in HTML and/or INDD. We don't see this, unless we suffer some type of brain-seizure and export an uncleaned, unfixed client's file to HTML, and even THEN, we typically don't see them.
|
You normally don't use the DefaultParagraphFont style as such. It's just there, lurking in the dark, sometimes springing to life as in Roger64's files. In Word 2013/2016 it's even hidden.
Quote:
Originally Posted by Hitch
Whatever. As long as Roger's issue or non-issue, or whatever, is solved.
Hitch
|
Exactly