I really wish this talk about the xml serialization of html going away would stop. It is pure nonsense spread to create doubt. True FUD.
XML parsing rules are not going to be deprecated in the whatwg spec! Most of the rules on writing html explicitly allow you to fully close every tag, not use tags out of order, etc. In other words, most of the xml serialization is legal html and will continue to be so. Other than case sensitivity, namespaces, the use of attributes, and a handful of void tags, html allows xml parsing rules. It is just a serialization that makes for easy parsing tools without the need for a full-fledged html spaghetti code parser.
It is the resulting DOM tree that matters.
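To make that point concrete, here is a rough sketch using only the Python standard library. It feeds one well-formed fragment to both an XML parser and the lenient `html.parser` tokenizer and compares the trees that come out. The sample markup, the `VOID` set, and the `HtmlTreeBuilder` and `xml_to_tree` helpers are all made up for illustration, and `html.parser` is not a full implementation of the whatwg parsing algorithm. The fragment also leaves out the xhtml namespace declaration, since namespaces are one of the places where the two really do differ.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: fully closed tags, quoted attributes, lowercase names.
DOC = (
    '<section><h1>Title</h1>'
    '<p class="a">Hello<br />world</p>'
    '<img src="x.png" alt="x" /></section>'
)

# Partial list of void elements, enough for this sample only.
VOID = {"br", "img", "hr", "meta", "link", "input"}


class HtmlTreeBuilder(HTMLParser):
    """Builds nested (tag, attrs, children) tuples from html.parser events."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def handle_starttag(self, tag, attrs):
        node = (tag, dict(attrs), [])
        if self.stack:
            self.stack[-1][2].append(node)
        else:
            self.root = node
        if tag not in VOID:  # void elements never hold children
            self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack:
            self.stack[-1][2].append(data)


def xml_to_tree(elem):
    """Converts an ElementTree element into the same nested tuple shape."""
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        children.append(xml_to_tree(child))
        if child.tail:
            children.append(child.tail)
    return (elem.tag, dict(elem.attrib), children)


builder = HtmlTreeBuilder()
builder.feed(DOC)
builder.close()

html_tree = builder.root
xml_tree = xml_to_tree(ET.fromstring(DOC))
print(html_tree == xml_tree)
```

For this fragment both parsers should end up with the same structure, which is all a reading system ultimately cares about.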
Please stop spreading misinformation.
And as for archival formats (and epubs need to be able to meet those archival standards), true xml is the dominant text storage technology, and is in use by most word processors and office suites.
And as for increasing epub adoption, right now you can take html code and put it in Calibre or Sigil and it will nicely be "fixed" to meet the xml serialization rules needed for epub. So using html for authoring already works. We (Sigil) already encourage users to use Word, OpenOffice, LibreOffice (i.e. real writing tools) to create the source matter, and then they can use Sigil or Calibre to make it meet the epub standards.
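As a rough illustration of what that "fixing" has to deal with, the sketch below checks whether a piece of markup is well-formed xml using Python's standard library. This is not how Sigil or Calibre do it internally; the `is_well_formed` helper and both sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Returns True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Typical hand-written html: unclosed <br> and <p> tags.
tag_soup = "<div><p>First<br><p>Second</div>"

# The same content with every tag explicitly closed.
xhtml_style = "<div><p>First<br /></p><p>Second</p></div>"

print(is_well_formed(tag_soup))
print(is_well_formed(xhtml_style))
```

The first string is the kind of input the tools have to repair; the second is what comes out the other side, and it is still perfectly legal html.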
All that you are going to achieve by adding html as an allowed core media type is to further fragment the epub publishing marketplace for no real gain. Just increased costs, along with yet another delay in industry adoption.
Last edited by KevinH; 09-04-2025 at 11:23 AM.