MobileRead Forums

KevinH · 07-27-2017, 09:53 AM

Under the spec, all entities are converted to their character representation during parsing. For most entities that is fine, but for some whitespace-related entities it is a problem, because you can no longer tell a normal space from a non-breaking space on sight. To that end, Sigil users can choose to preserve entities: Sigil restores the entity representation in the text after parsing and serialization.
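To make the problem concrete, here is a small Python sketch. It is only an illustration using the standard library's html.unescape, not Sigil's own parser. It shows that the entity becomes a single U+00A0 character after decoding, which compares as different from a normal space even though the two look the same in an editor.

```python
import html

# Decoding the entity yields the single character U+00A0 (non-breaking space).
decoded = html.unescape("a&nbsp;b")

# A string containing an ordinary space, for comparison.
plain = "a b"

# The two strings look alike when displayed but are not equal.
is_nbsp = decoded == "a\u00a0b"
looks_same_but_differs = decoded != plain

print(is_nbsp, looks_same_but_differs)
```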

For epub2, named entities are allowed, so &nbsp; is typically added to the list of entities to preserve. Upon first load, all non-breaking spaces will then be converted to their named entity equivalent.

For epub3, only numeric entities are allowed (aside from the official XML named entities), so leaving &nbsp; in an epub3 would result in warnings/errors from epubcheck. For epub3, Sigil users should use the numeric version of the entity (not the named entity version) to get a non-breaking space. This is typically written as &#160; although there are hex-based versions of the same thing, such as &#xA0;.
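As a rough illustration of why the named form is a problem in epub3, the following Python sketch feeds small fragments to a plain XML parser with no DTD. This is not how epubcheck works internally. It only shows that such a parser knows the built-in XML entities and numeric references but not nbsp. The helper name parses_as_xml is made up for this example.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is accepted by a plain XML parser (no DTD)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

named_ok = parses_as_xml("<p>a&nbsp;b</p>")      # named entity, undefined in plain XML
decimal_ok = parses_as_xml("<p>a&#160;b</p>")    # decimal numeric reference
hex_ok = parses_as_xml("<p>a&#xA0;b</p>")        # hex numeric reference
xml_named_ok = parses_as_xml("<p>a&amp;b</p>")   # one of the built-in XML entities

# The decimal and hex forms decode to the same character, U+00A0.
same_char = (
    ET.fromstring("<p>&#160;</p>").text
    == ET.fromstring("<p>&#xA0;</p>").text
    == "\u00a0"
)

print(named_ok, decimal_ok, hex_ok, xml_named_ok, same_char)
```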

So again, for an epub3 with the numeric entity in the preserve list, on first load and on any other parsing/serializing activity, the non-breaking space characters will be converted to their numeric entity equivalent.
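The preserve step described above can be sketched in Python roughly like this. It is a minimal model under stated assumptions, not Sigil's actual code: the function restore_entities and the two preserve tables are hypothetical names invented for this example. After parsing and serialization, each character on the preserve list is written back as its chosen entity.

```python
from typing import Dict

NBSP = "\u00a0"

def restore_entities(serialized: str, preserve: Dict[str, str]) -> str:
    """Replace each preserved character with its entity representation."""
    for char, entity in preserve.items():
        serialized = serialized.replace(char, entity)
    return serialized

# Hypothetical preserve lists: named form for epub2, numeric form for epub3.
EPUB2_PRESERVE = {NBSP: "&nbsp;"}
EPUB3_PRESERVE = {NBSP: "&#160;"}

# Text as it looks after parsing: the entity is already a raw U+00A0 character.
text = "Chapter" + NBSP + "1"

epub2_out = restore_entities(text, EPUB2_PRESERVE)
epub3_out = restore_entities(text, EPUB3_PRESERVE)

print(epub2_out)
print(epub3_out)
```

Because the replacement runs on every serialization, raw non-breaking spaces typed or pasted later would also end up in entity form, which matches the behaviour described in the post.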

If you are not seeing this, exactly what version of Sigil are you using? Just how old is it?

Post #6 · KevinH (Sigil Developer) · Last edited by KevinH; 07-27-2017 at 10:13 AM.