MobileRead Forums - View Single Post

roger64 · 02-24-2014, 12:14 AM

Thanks, all, for your input on this. The expression "some believe" in the wiki is a bit unclear...

Now let's come back to the full original statement from Kovid Goyal:

Quote:

Originally Posted by kovidgoyal

Unicode characters work in every single place that entities work. Both unicode characters and entities require a declaration in the header. The DOCTYPE in the case of named entities and the character encoding in the case of unicode characters.

The DOCTYPE is an absolute requirement in order to use named entities in XHTML, while the character encoding is not, since the default encoding for XHTML is UTF-8 when undeclared, which is the encoding the calibre editor uses.

Therefore, named entities are actually *less* likely to work than unicode characters.

From what I can see, the second paragraph of this statement is ambiguous because it's both right and wrong.

- Wrong, because formally EPUB 2.0.1, the latest iteration of the official standard before EPUB 3, still relies on the XHTML 1.1 specification, for which the DOCTYPE is an absolute requirement. Full stop.

- Right, because technically the DOCTYPE is only useful for named entities. So even if the DOCTYPE is formally required, it is not technically needed if the text uses only UTF-8 encoded Unicode characters.

I agree with Arios: there seem to be absolutely no compatibility problems arising from removing the DOCTYPE, as long as the EPUB 2 file contains only UTF-8 encoded Unicode characters.
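
For anyone who wants to check this at home, here is a rough sketch using Python's standard-library XML parser. It is only an illustration of how a strict XML parser treats the cases discussed in this post, not a test of any actual reading system or of calibre itself. The sample strings and the helper name `parses` are my own, and the internal `<!ENTITY ...>` declaration in the last sample is just a stand-in for what the XHTML DTD would normally provide through the DOCTYPE.

```python
import xml.etree.ElementTree as ET


def parses(doc: bytes) -> bool:
    """Return True if the strict XML parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Named entity with no DOCTYPE: nothing defines &eacute;, so parsing fails.
named_no_doctype = b"<p>caf&eacute;</p>"

# Literal character stored as UTF-8 bytes, with no encoding declaration at all.
literal_utf8 = "<p>caf\u00e9</p>".encode("utf-8")

# Numeric character reference: needs neither a DOCTYPE nor a declaration.
numeric_ref = b"<p>caf&#233;</p>"

# Named entity declared in an internal subset (stand-in for the XHTML DTD).
named_declared = b'<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'

if __name__ == "__main__":
    samples = [
        ("named entity, no DOCTYPE", named_no_doctype),
        ("literal UTF-8 character", literal_utf8),
        ("numeric reference", numeric_ref),
        ("named entity, declared", named_declared),
    ]
    for label, doc in samples:
        print(f"{label}: {'ok' if parses(doc) else 'parse error'}")
```

Only the first sample should be rejected, which matches the point above: the DOCTYPE matters for named entities, while plain Unicode characters get by without it.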