11-14-2020, 11:42 PM   #371
Thasaidon
Hedge Wizard
Quote:
Originally Posted by KevinH
Yes, technically epub is xhtml thus xml. So to be rigorous to the epub2 spec, having a doctype that specifies the version of xhtml and the named entities should be important. My guess is most ereaders are serving these pages as html or adding the doctype on the fly if missing.

To be safest, it technically should specify a doctype. Sigil has always used one and added it where needed during load. Only since Sigil 1.0, when Sigil stopped moving and updating every page on initial load, have there been pages without it inside Sigil. That is why I added it to our well-formed check on epub load, so it can be detected and fixed automatically, like it was in the past, if the user wants it to be.

If an epub does not use named entities outside of those recognized by xml, and instead uses no entities or only numeric entities, then you could probably dispense with the DOCTYPE safely. But since under epub2, Sigil supports named entities (such as &nbsp;), Sigil needs and enforces the DOCTYPE.

Calibre on the other hand removes all named entities and replaces them with the correct unicode character, so it can then remove the epub2 doctype safely.

Sigil uses the doctype as specified in the epub2 (2.0.1) specification for xhtml files.
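
For anyone who wants to see the point about named entities in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names parses_without_doctype and replace_named_entities are hypothetical, the entity table is Python's HTML 4 list from html.entities, and this is not how Sigil or Calibre actually implement anything. It shows that a plain XML parser rejects &nbsp; when no DOCTYPE declares it, and that swapping named entities for the real unicode characters makes the same markup parse without one.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser knows without any DOCTYPE.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entity references such as &nbsp; or &eacute;
# (numeric references like &#160; start with '#' and are not matched).
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def parses_without_doctype(fragment: str) -> bool:
    """Return True if a strict XML parser accepts the fragment as-is."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


def replace_named_entities(text: str) -> str:
    """Replace HTML named entities with their unicode characters.

    The XML predefined entities are left alone so the markup stays
    well-formed; unknown names are also left untouched.
    """
    def _sub(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(_sub, text)


if __name__ == "__main__":
    page = "<p>a&nbsp;b &amp; caf&eacute;</p>"
    print(parses_without_doctype(page))                          # named entity, no DOCTYPE
    print(parses_without_doctype(replace_named_entities(page)))  # after replacement
```

The second function is the same general idea as the Calibre behaviour described above: once no named entities remain beyond the XML ones, nothing in the page depends on the DOCTYPE's entity declarations.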

Thank you for the explanations, KevinH and Hobnail.

I understand now. That is something new I learned today.

My knowledge of XML and HTML has been acquired by trial and error, editing ePubs, and a little reading. It is usually sufficient for my needs, but I know my knowledge is limited, which is why I am a "Hedge Wizard" and not a Grand High Sorcerer like KevinH and some others on these forums.