#1 |
eBook Designer
Posts: 31
Karma: 10
Join Date: Jun 2010
Location: New Jersey
Device: ipad, nook, kindle
|
I am reaching out here because we want to hear from people making EPUBs and making tools for EPUBs.
What if, in the future, EPUBs could contain HTML files in addition to XHTML?
As you know, technology is constantly evolving. One technology currently central to EPUB files is XML, or more exactly the XML syntax of HTML (XHTML). Use of this syntax is declining outside of publishing. To provide for the future of the EPUB ecosystem, we plan to remove the requirement for the XML syntax in the EPUB package. This change will not affect the document formats currently allowed in an EPUB publication; it would simply remove the restriction to the XML syntax.
We hope to understand how this might affect the community. That way, we can identify potential obstacles and make the transition to accepting HTML easier. Your input is valuable and will help shape the future of ebooks. Please take this brief four-question survey: https://www.w3.org/wbs/1/epubhtml/
Who We Are
This survey was created by the Publishing Maintenance Working Group, part of the World Wide Web Consortium (W3C). Our mission is to maintain the Recommendations for EPUB files and ereaders. We work incrementally to improve the Recommendations, provide additional clarity, and grow the Recommendations for the future of ebooks.
#2 |
Sigil Developer
Posts: 8,922
Karma: 6240958
Join Date: Nov 2009
Device: many
|
I am already on record as being totally against adding html parsing rules to the epub3 standard since it will further fracture the epub3 base, will make backwards compatibility with old epub2 only readers next to impossible, and is just a really bad idea in any epub3 spec. Save it for epub 4, if you must.
Second, your proposition that xml is dying is simply incorrect and quite misleading. It should not be used in the lead-in to your survey. Talk about adding a bias to a survey!
Most current word processors (commonly used by fiction and non-fiction authors) use XML. XML is also heavily used in text storage, archival, and databases. Although html can omit some implied end tags and handles some void tags differently, using open and closed tags is still part of the standard. XHTML as a spec was long ago replaced by xml parsing rules applied to html5 and its current living spec versions. This is not going to change no matter what the W3C thinks.
Also, in the lead-up to your survey link, you neglect to say this change would be made in the epub3 spec, and are thereby hiding the fractures and splits it would make to our current standard. I would have a very different response if this were being proposed for an epub 4 spec, as I suspect many others would.
Last edited by KevinH; 07-09-2025 at 03:53 PM.
#3 |
creator of calibre
Posts: 45,466
Karma: 27757440
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I have to agree that this is rather unnecessary. XHTML as a spec might be dead, but using a valid XML serialization of HTML 5 is perfectly fine and supports evolving web standards perfectly. This is what happens in practice today anyway. Anything you can do in HTML 5 works fine when the HTML 5 is serialized as valid XML.
Forcing EPUB software developers to support HTML 5 parsing just creates unnecessary busy work for no good reason. The *only* advantage I can see for the HTML 5 serialization over the XML serialization is that the former is easier to write by hand. I don't think that advantage justifies the cost. Most EPUB editing tools already have some kind of functionality to either flag or auto-correct invalid XML, which makes writing it not that hard.
And I agree with KevinH that a change this disruptive should be in EPUB 4; otherwise it will just end up getting ignored and unused, like the yo-yoing that was done with EPUB metadata.
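As a rough illustration of the "flag invalid XML" kind of check mentioned above, here is a minimal sketch using only Python's standard library. The function name `find_xml_error` and both sample documents are hypothetical; this is not how Sigil or calibre implement their checks, only an example of what a well-formedness test looks like.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_xml_error(markup: str) -> Optional[str]:
    """Return None if markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical content documents: the first closes every tag, the second
# uses an HTML-style unclosed <br>, which is not well-formed XML.
xml_style = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>T</title></head>"
    "<body><p>Line one<br/>line two</p></body></html>"
)
html_style = "<html><body><p>Line one<br>line two</p></body></html>"

print(find_xml_error(xml_style))   # None, because the document is well-formed
print(find_xml_error(html_style))  # a parser error message describing the problem
```

A check like this only flags the problem; auto-correction would need an HTML-aware parser, which is outside what this sketch covers.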
#4 |
Bibliophist
Posts: 7,291
Karma: 7237230
Join Date: Dec 2021
Location: England
Device: none
|
I don't see what improvement it would make, and it would likely break backward compatibility with epub2 as Kevin says. Would readers want that - I think not. Why is there no question for readers rather than developers?
Edit: And why, in an anonymous poll, do you want my name, surname, employer and job title as well as my email address? Not very anonymous, is it?
Last edited by Martinoptic; 07-10-2025 at 04:53 AM.
#5 |
Grand Sorcerer
Posts: 28,716
Karma: 205039118
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I'm from the "Don't unnecessarily make API-breaking changes without a new major version (and a new soname)" camp myself. I know that analogy isn't exactly one-to-one, but it fits well enough.
But even that code of the school yard is being broken more and more often these days, it seems (*cough* libxml2 *cough*).
I did submit my feelings via the "anonymous" survey.
#6 |
Grand Sorcerer
Posts: 13,586
Karma: 79436940
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
There are some interesting discussions out there. See for instance "Allow pure HTML 5 in EPUB 3" https://www.edrlab.org/2025/07/06/al...ml5-in-epub-3/
|
#7 |
Sigil Developer
Posts: 8,922
Karma: 6240958
Join Date: Nov 2009
Device: many
|
Interesting article. Yes, a new standard would be a big improvement. Trying to shoehorn it into epub3 would not.
That said, Thorium will not work with pure html5 directly (I tested it), but it sounds like they are working on it.
#8 |
Grand Sorcerer
Posts: 13,586
Karma: 79436940
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
Another source worth following is the public mailing list of the W3C Publishing Maintenance Working Group, especially the agendas and minutes from its meetings. See https://lists.w3.org/Archives/Public/public-pm-wg/
|
#9 |
creator of calibre
Posts: 45,466
Karma: 27757440
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
Quote:
All in all, it looks like there is a ton of confusion in this space, starting with the people proposing this change, who don't seem to understand that the serialization of HTML is irrelevant to what features it supports. Like I said in my previous post, the *only* pro for non-XML serialization is that it is easier to author using a plain text editor.
#10 |
eBook Designer
Posts: 31
Karma: 10
Join Date: Jun 2010
Location: New Jersey
Device: ipad, nook, kindle
|
Thanks to everyone for replying, and thank you to those who filled out the survey.
I’ll be sure to share the anonymized results. The survey is open until mid-September, so I’ll have complete results sometime after that.
#11 |
eBook Designer
Posts: 31
Karma: 10
Join Date: Jun 2010
Location: New Jersey
Device: ipad, nook, kindle
|
Anonymous poll and EPUB user input
Quote:
When you say users, do you mean people who use EPUB testing and editing tools, and/or readers who read ebooks? What question would you propose?
|
#12 |
Bibliophist
Posts: 7,291
Karma: 7237230
Join Date: Dec 2021
Location: England
Device: none
|
Thank you for your reply. I meant people who read ebooks and may occasionally tweak them.
As I didn't get to the first question in the survey (I'm not prepared to give all the details requested), it may well be that a further question isn't needed. As long as it isn't a requirement to have a job in e-publishing (which is what I took the sign-up to suggest), then maybe the survey is OK as it stands. What do other people think?
Last edited by Martinoptic; 07-11-2025 at 02:22 PM.
#13 |
Resident Curmudgeon
Posts: 80,096
Karma: 148565303
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Look how long it's taken for ePub 3 to be supported, and even then it's still not supported well enough. We don't need more changes that will take too long to support or may never be supported well enough. So let's can this nonsense and leave things as they are.
|
#14 |
eBook Designer
Posts: 31
Karma: 10
Join Date: Jun 2010
Location: New Jersey
Device: ipad, nook, kindle
|
Quote:
If Reading Systems are based on User Agents (same as the web) and XML is deprecated in the HTML spec, what happens to the books? What happens to the workflows? And the tools? If we wait until XML is no longer supported to introduce HTML, how long will it take for the ebook ecosystem to respond?
I appreciate XML. I’ve made good use of it and its precise namespaces for literally years now. I’m sorry, and frankly a little surprised, that the larger web community is intent on deprecating it. That is out of scope for the Publishing Group. We literally cannot fork all of HTML.
The question is: how do we help Reading Systems, Developers, Publishers and Readers have a smooth transition? Introducing HTML sooner rather than later could give the ecosystem time to adjust.
I hope you have filled out the survey; your perspective is important. https://www.w3.org/wbs/1/epubhtml/
Comments close on September 15. I will post the anonymized results here when they are published.
|
#15 |
Sigil Developer
Posts: 8,922
Karma: 6240958
Join Date: Nov 2009
Device: many
|
I really wish this talk about the xml serialization of html going away would stop. It is pure nonsense spread to create doubt. True FUD.
XML parsing rules are not going to be deprecated in the whatwg spec! Most of the rules on writing html explicitly allow you to fully close every tag, not use tags out of order, etc. In other words, most of the xml serialization is legal html and will continue to be so. Other than case sensitivity, namespaces, and the use of attributes and a handful of void tags, html allows xml parsing rules. It is just a serialization that makes for easy parsing tools without the need for a full-fledged html spaghetti-code parser. It is the resulting DOM tree that matters. Please stop spreading misinformation.
And as for archival formats (and epubs need to be able to meet those archival standards), true xml is the dominant text storage technology, and is in use by most word processors and office suites.
And as for increasing epub adoption, right now you can take html code and put it in Calibre or Sigil and it will nicely be "fixed" to meet the xml serialization rules needed for epub. So using html for authoring already works. We (Sigil) already encourage users to use Word, OpenOffice, LibreOffice (i.e. real writing tools) to create the source matter, and then they can use Sigil or Calibre to make it meet the epub standards.
All that you are going to achieve by adding html as an allowed core media type is to further fragment the epub publishing marketplace for no real gain. Just increased costs, along with yet again delaying industry adoption.
Last edited by KevinH; 09-04-2025 at 11:23 AM.
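To make the claim that the XML serialization is also legal HTML more concrete, here is a small sketch using only Python's standard library. It feeds one hypothetical content document (`DOC`) to an XML parser (`xml.etree.ElementTree`) and to `html.parser.HTMLParser`, then compares the element names each one sees. Note the assumption: `HTMLParser` is a lenient tokenizer, not a full HTML5 tree builder like a browser's, so this is only a stand-in for the real comparison of DOM trees.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import List

# Hypothetical EPUB-style content document written in the XML serialization.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Sample</title></head>"
    '<body><p class="first">Line one<br/>line two</p></body></html>'
)


class TagCollector(HTMLParser):
    """Records element names in the order the HTML tokenizer opens them."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <br/> also arrive here by default.
        self.tags.append(tag)


def tags_via_xml(markup: str) -> List[str]:
    """Element names in document order, with the XML namespace stripped."""
    root = ET.fromstring(markup)
    return [el.tag.split("}", 1)[-1] for el in root.iter()]


def tags_via_html(markup: str) -> List[str]:
    """Element names in the order the HTML tokenizer reports them."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# For this document both routes report the same elements in the same order.
print(tags_via_xml(DOC) == tags_via_html(DOC))
```

The reverse does not hold in general: markup that relies on omitted end tags or unclosed void elements is accepted by HTML parsers but rejected by an XML parser.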