Old 03-21-2009, 05:41 PM   #31
cerement
Groupie
Posts: 170
Karma: 2000
Join Date: Apr 2008
Location: San José, CA
Device: Amazon Kindle 1, Sony PRS-300, Amazon Kindle 3
See Kirtai's response above ...
Quote:
Originally Posted by JSWolf View Post
Unless the eBook in question has need beyond the bounds of HTML, then HTML is the optimum format for multiple format generation.
HTML relies on conventions, not standards, and is not strict enough to keep out the errors that creep in during conversion to other formats. You end up falling back on class="aaa" far more often to get things to behave. Jon Noring has already commented on the shortcomings of XHTML when dealing with eBooks.

Quote:
Originally Posted by JSWolf View Post
Calibre can take that HTML and generate ePub, LRF, LIT, and Mobipocket. So why would you need this nonstandard DTBook in most cases?
DTBook, TEI, and DocBook are all standards, and all are far more rigorous than (X)HTML. DocBook's parent, SGML, has been around far longer than HTML, and DocBook itself dates from the early 1990s. ePub allows the primary text component to be in either DTBook or XHTML (though it falls short in that the "packing list" does not allow non-linear reading the way DITA does). By contrast, the HTML generated by many WYSIWYG editors is a hideously bloated tag soup that wastes Calibre's processing features, can easily break conversions, and results in files MANY times larger than necessary (for a long time MS Word would wrap <font face="xxx" size="yyy"> tags around every single paragraph).

Most of these formats also have the advantage of freely available tools (including a massive toolset on O'Reilly for DocBook) for conversion to several output formats, and if a tool doesn't exist, some work with XSLT can handle the conversion.
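
To give a rough idea of why a strict source format converts so mechanically, here is a minimal sketch. It uses Python's standard library rather than XSLT (the stdlib has no XSLT processor), and the tag mapping, the function names, and the sample fragment are all made up for illustration; it covers a tiny handful of DocBook-style elements, nothing like the real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few DocBook-style element names to HTML tags.
# Anything not listed here (e.g. <chapter>) falls back to a plain <div>.
TAG_MAP = {"title": "h1", "para": "p", "emphasis": "em"}


def docbook_to_html(elem):
    """Recursively copy an element tree, renaming tags via TAG_MAP."""
    out = ET.Element(TAG_MAP.get(elem.tag, "div"))
    out.text = elem.text   # text before the first child
    out.tail = elem.tail   # text following this element inside its parent
    for child in elem:
        out.append(docbook_to_html(child))
    return out


def convert(source):
    """Parse a well-formed XML string and return the converted markup."""
    root = docbook_to_html(ET.fromstring(source))
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; not taken from any real book.
SAMPLE = (
    "<chapter><title>One</title>"
    "<para>Some <emphasis>strict</emphasis> markup.</para></chapter>"
)

if __name__ == "__main__":
    print(convert(SAMPLE))
```

Because every element in the source carries a meaning (title, paragraph, emphasis), the conversion is a simple lookup. With tag soup there is nothing reliable to look up, which is where the class="aaa" workarounds come from.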

TLDR: HTML is fine for an end format, but it is nowhere near clean enough for the beginning format.