05-14-2013, 06:30 AM   #30
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
Quote:
Originally Posted by roger64
I can duplicate this exact same behaviour with Calibre produced EPUB of the same odt file. So you should also add this software to the ones with a faulty export function which is not creating correct &amp;nbsp; ... I'll attach the files. If you need more examples, just tell me.
@roger64 - I also did a conversion yesterday using Calibre, and it too writes ODT non-breaking spaces as the raw character U+00A0 (hex 00A0) rather than as an entity.

Despite this, some rendering engines (such as the Calibre viewer) still treat the character as a non-breaking space and will not break a line there.

The underlying problem seems to be with the ODT conversion process. I'm guessing that they all (writer2xhtml, writer2epub and calibre) use the same library. What is needed is a filter that converts every raw non-breaking space (U+00A0) and similar characters to the equivalent HTML entities.
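
As a rough idea of what such a filter could look like, here is a minimal Python sketch. It is only an illustration under my own assumptions, not what any of the converters actually do: the function name `entity_filter` and the `REPLACEMENTS` table are made up for this example, and I have used numeric references (`&#160;`, `&#173;`) rather than named entities because numeric references are valid in any XML/XHTML file, whereas `&nbsp;` depends on the DTD in use.

```python
# Raw characters to rewrite, and the numeric character references to use.
# Assumption: only these two characters are handled; extend the table for
# anything else that turns out to be affected.
REPLACEMENTS = {
    "\u00a0": "&#160;",  # non-breaking space
    "\u00ad": "&#173;",  # soft hyphen
}


def entity_filter(xhtml: str) -> str:
    """Return xhtml with each listed raw character replaced by its reference."""
    for char, ref in REPLACEMENTS.items():
        xhtml = xhtml.replace(char, ref)
    return xhtml


sample = "<p>10\u00a0km and co\u00adoperate</p>"
print(entity_filter(sample))  # prints <p>10&#160;km and co&#173;operate</p>
```

In practice this would have to be run over each XHTML file inside the EPUB after conversion, reading and writing the files as UTF-8.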

It appears that earlier versions of Sigil carried out this translation when opening the EPUB.

By the way, I have also noticed that it's possible in LO to create what appear to be italic characters that don't get converted. I think it has to do with the use of CharPosture=1, which produces a visually slanted (oblique) character, rather than CharPosture=2, which gives a true italic. There may be other characters, such as the soft hyphen, that don't get properly converted either.
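
If the CharPosture guess is right, one possible workaround would be to normalise the posture in the ODT's XML before converting. The Python sketch below is untested against real files and rests on an assumption of mine: that LO saves CharPosture=1 as `font-style="oblique"` on the `fo:font-style` attribute (and its `-asian` / `-complex` variants) in content.xml and styles.xml. The function name `oblique_to_italic` is hypothetical.

```python
import re

# Assumption: the oblique posture appears as font-style="oblique", possibly
# with an -asian or -complex suffix on the attribute name.
OBLIQUE = re.compile(r'(font-style(?:-asian|-complex)?=)"oblique"')


def oblique_to_italic(odt_xml: str) -> str:
    """Rewrite oblique font-style attributes to italic, leaving the rest alone."""
    return OBLIQUE.sub(r'\1"italic"', odt_xml)


sample = '<style:text-properties fo:font-style="oblique"/>'
print(oblique_to_italic(sample))
```

The XML would need to be extracted from the .odt (it is a zip archive), filtered, and packed back in before running the conversion.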

BobC

Last edited by BobC; 05-14-2013 at 06:33 AM.