Old 03-21-2009, 05:41 PM   #31
cerement
Groupie
Posts: 170
Karma: 2000
Join Date: Apr 2008
Location: San José, CA
Device: Amazon Kindle 1, Sony PRS-300, Amazon Kindle 3
See Kirtai's response above ...
Quote:
Originally Posted by JSWolf View Post
Unless the eBook in question has need beyond the bounds of HTML, then HTML is the optimum format for multiple format generation.
HTML relies on conventions, not standards, and is not stringent enough to keep errors from creeping in during conversion to other formats. You end up leaning on class="aaa" far more often to get things to behave. Jon Noring has already commented on the shortcomings of XHTML when dealing with eBooks.

Quote:
Originally Posted by JSWolf View Post
Calibre can take that HTML and generate ePub, LRF, LIT, and Mobipocket. So why would you need this nonstandard DTBook in most cases?
DTBook, TEI, and DocBook are all standards, and all are far more rigorous than (X)HTML. DocBook (and its parent, SGML) has been around far longer than HTML. ePub allows the primary text component to be in either DTBook or XHTML (but falls short in that its "packing list" does not allow non-linear reading the way DITA does). Whereas the HTML generated by many WYSIWYG editors is a hideously bloated tag soup that wastes Calibre's processing features, can easily break conversions, and results in files MANY times larger than necessary (MSWord for a long time would wrap <font face="xxx" size="yyy"> tags around every single paragraph).

Most of these formats also have the advantage of freely available tools (including a massive toolset on O'Reilly for DocBook) for conversion to several outputs, and if the tool doesn't exist, some work with XSLT can handle those conversions.

TLDR: HTML is fine for an end format, but it is nowhere near clean enough for the beginning format.
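
To make the "not stringent enough" point concrete, here is a minimal sketch in Python (standard library only). It assumes that "rigorous" means, at the very least, that the source must be well-formed XML; the helper name `is_well_formed` and both sample strings are invented for illustration. A strict XML parser rejects the kind of tag soup a browser would silently accept:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical tag soup: unclosed <p> elements and an unquoted attribute value.
tag_soup = "<div><p>First paragraph<p>Second <font size=3>paragraph</font></div>"

# The same content written as well-formed XML.
clean = "<div><p>First paragraph</p><p>Second paragraph</p></div>"

print(is_well_formed(tag_soup))
print(is_well_formed(clean))
```

Well-formedness is only the first hurdle; validating against a DTD or schema (which the standard library does not do) is what would catch structural mistakes such as a heading in the wrong place.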
Old 03-23-2009, 12:20 PM   #32
Kirtai
Addict
Posts: 304
Karma: 2454436
Join Date: Sep 2008
Device: PRS-505, PRS-650, iPad, Samsung Galaxy SII (JB), Google Nexus 7 (2013)
Now that I think about it, ePub containing DTBook might be the best interim standard format for sequential texts like novels and such until a better format becomes available.
Especially if Calibre and other tools start supporting it, which they should.
Old 03-24-2009, 07:22 AM   #33
Sweetpea
Grand Sorcerer
Posts: 9,707
Karma: 32763414
Join Date: Dec 2008
Location: Krewerd
Device: Pocketbook Inkpad 4 Color; Samsung Galaxy Tab S6
Quote:
Originally Posted by cerement View Post
Whereas the HTML generated by many WYSIWYG editors is a hideously bloated tag soup that wastes Calibre's processing features, can easily break conversions, and results in files MANY times larger than necessary (MSWord for a long time would wrap <font face="xxx" size="yyy"> tags around every single paragraph).
I wouldn't ever use a WYSIWYG editor for something you'd like to call a source format. And I put MSWord in that group as well (actually, any wordprocessor).

I took a quick look at DTBook, and from what I saw it too is just XML with a DTD (as is XHTML).


I still prefer XHTML for two reasons. One, I can edit it with Notepad if need be. Two, I can "preview" it in a browser. As long as you don't use a WYSIWYG editor, HTML is perfect for the job.


------

Actually, I'd advise against using a WYSIWYG editor or MSWord (or any other word processor) for creating HTML files at all...
Old 03-27-2009, 08:20 PM   #34
Kirtai
As another option, has anyone looked at Fictionbook2 as a master format?
Old 03-28-2009, 03:23 PM   #35
Kirtai
I've been giving this some more thought and it occurs to me that maybe the best approach would be to build better tools and use the best format for each type of book.

e.g.
For linear books like fiction, use DTBook or Fictionbook2
For technical books, use DocBook
For topic/map books like encyclopedias and recipe books, use DITA
For complex office documents, use OpenDocument
For complex books that don't fit into any other category, use TEI

A single toolchain could convert all of these into any output format, or readers could even handle them directly, though this might be a bit pie in the sky.
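
As a rough illustration of what such a single toolchain might look like at its core, here is a minimal Python sketch of a converter registry. Everything in it is hypothetical: the `SOURCE_FORMAT` table copies only part of the list above, and `register`, `convert`, and the placeholder converter are invented names, not part of Calibre or any other existing tool.

```python
from typing import Callable, Dict, Tuple

# Suggested source format per kind of book (a subset of the list above).
SOURCE_FORMAT: Dict[str, str] = {
    "fiction": "DTBook",
    "technical": "DocBook",
    "topic-map": "DITA",
    "other": "TEI",
}

Converter = Callable[[str], str]

# Maps (source format, output format) to the function that does the work.
CONVERTERS: Dict[Tuple[str, str], Converter] = {}


def register(source: str, target: str) -> Callable[[Converter], Converter]:
    """Decorator that files a converter under its (source, target) pair."""
    def wrap(func: Converter) -> Converter:
        CONVERTERS[(source, target)] = func
        return func
    return wrap


@register("DTBook", "ePub")
def dtbook_to_epub(text: str) -> str:
    # Placeholder only: a real converter would run an XSLT or similar transform.
    return f"[ePub built from DTBook: {len(text)} chars]"


def convert(kind: str, text: str, target: str) -> str:
    """Pick the source format for this kind of book and run the converter."""
    source = SOURCE_FORMAT[kind]
    try:
        converter = CONVERTERS[(source, target)]
    except KeyError:
        raise LookupError(f"no converter registered for {source} -> {target}")
    return converter(text)


print(convert("fiction", "<dtbook/>", "ePub"))
```

The point of the design is that supporting a new source or output format means registering one more function, while the calling code stays the same.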

I've heard that the Rosetta tools go even further by storing the document in a database format from which it can be exported in any format, but I've been unable to find any further information on that.
Old 03-28-2009, 05:41 PM   #36
cerement
Quote:
Originally Posted by Kirtai View Post
I've heard that the Rosetta tools go even further by storing the document in a database format from which it can be exported in any format, but I've been unable to find any further information on that.
TomeRaider on Palm basically did that: everything from short linear texts to huge encyclopedias was stored as a database. DITA has the advantage in this respect in that it makes it very easy to chunk and reassemble anything and then integrate the chunks with a CMS of your choice (e.g. DITA for WordPress) to handle customized delivery.
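
The store-as-database idea can be sketched in a few lines of Python with the built-in `sqlite3` module. This is only an illustration of chunking and reassembly, not TomeRaider's or DITA's actual storage model; the table layout and the `build_store` and `export_text` helpers are assumptions made up for the example.

```python
import sqlite3


def build_store(chunks):
    """Create an in-memory database holding (position, title, body) chunks."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE chunk (position INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    db.executemany("INSERT INTO chunk VALUES (?, ?, ?)", chunks)
    db.commit()
    return db


def export_text(db, positions=None):
    """Reassemble all chunks, or only the requested ones, in reading order."""
    if positions is None:
        rows = db.execute("SELECT title, body FROM chunk ORDER BY position")
    else:
        marks = ",".join("?" for _ in positions)
        rows = db.execute(
            f"SELECT title, body FROM chunk WHERE position IN ({marks}) "
            "ORDER BY position",
            list(positions),
        )
    return "\n\n".join(f"{title}\n{body}" for title, body in rows)


db = build_store([
    (1, "Chapter 1", "It began."),
    (2, "Chapter 2", "It went on."),
    (3, "Chapter 3", "It ended."),
])

# Customized delivery: pull only the chunks a given reader needs.
print(export_text(db, [3, 1]))
```

Because each chunk is a row, the same store can feed a full linear export or a selective one without touching the source text.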
Old 03-28-2009, 06:11 PM   #37
Hadrien
Feedbooks.com Co-Founder
Posts: 2,263
Karma: 145123
Join Date: Nov 2006
Location: Paris, France
Device: Sony PRS-t-1/350/300/500/505/600/700, Nexus S, iPad
If you're capable of writing your own XSLT, it doesn't really matter which DTD you're using, you can even create your own.

There's no perfect fit, each DTD will only solve a limited sub-set of problems.
The easiest one to work with is probably DTBook since it's basically XHTML with just a few semantic elements. TEI is incredibly powerful but might be overkill for what most people need.
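
To show how thin the layer between DTBook and XHTML can be, here is a minimal Python sketch that stands in for the XSLT one would normally write. It assumes a namespace-free fragment (real DTBook files declare a DAISY namespace), handles only bodymatter, and the function name `dtbook_to_xhtml_body` is invented for the example. The only real work is turning the level1 to level6 wrappers into div elements:

```python
import re
import xml.etree.ElementTree as ET

# DTBook structural wrappers: level1 through level6.
LEVEL = re.compile(r"level[1-6]$")


def dtbook_to_xhtml_body(dtbook_xml: str) -> str:
    """Map a namespace-free DTBook fragment onto XHTML-style markup."""
    root = ET.fromstring(dtbook_xml)
    body = ET.Element("body")
    # Only bodymatter is handled; frontmatter and rearmatter are ignored.
    for matter in root.iter("bodymatter"):
        for child in matter:
            body.append(child)
    for elem in body.iter():
        if LEVEL.match(elem.tag):
            # Keep the structural hint as a class, e.g. "level1 chapter".
            extra = elem.get("class")
            elem.set("class", f"{elem.tag} {extra}" if extra else elem.tag)
            elem.tag = "div"
    return ET.tostring(body, encoding="unicode")


sample = (
    "<dtbook><book><bodymatter>"
    '<level1 class="chapter"><h1>Chapter 1</h1><p>Text.</p></level1>'
    "</bodymatter></book></dtbook>"
)
print(dtbook_to_xhtml_body(sample))
```

Headings and paragraphs pass through untouched, which is the sense in which DTBook is "basically XHTML"; a real transform would also need to deal with notes, page numbers, and the other DTBook-specific elements.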
Old 03-29-2009, 01:21 PM   #38
Kirtai
Quote:
Originally Posted by Hadrien View Post
If you're capable of writing your own XSLT, it doesn't really matter which DTD you're using, you can even create your own.

There's no perfect fit, each DTD will only solve a limited sub-set of problems.
The easiest one to work with is probably DTBook since it's basically XHTML with just a few semantic elements. TEI is incredibly powerful but might be overkill for what most people need.
I expect this is what will happen. It also has the advantage of pushing most of the complexity into the tools which is exactly where it should be.
OTOH, it will require a lot of work for those of us who want to do it now, before the tools mature. I don't know if I'm happy or sad about that.
Old 03-29-2009, 03:26 PM   #39
cerement
Quote:
Originally Posted by Kirtai View Post
OTOH, it will require a lot of work for those of us who want to do it now, before the tools mature. I don't know if I'm happy or sad about that.
That's the biggest problem: I'm looking for a so-called "master format" when the whole publishing industry is still in its infancy as far as electronic books are concerned.

Quote:
Originally Posted by Hadrien View Post
There's no perfect fit, each DTD will only solve a limited sub-set of problems. The easiest one to work with is probably DTBook since it's basically XHTML with just a few semantic elements. TEI is incredibly powerful but might be overkill for what most people need.
Since we've managed to drag you into this topic, what format does Feedbooks have the least trouble ingesting? Or, better still, which format loses the least when uploaded and converted to Feedbooks' internal format?
Old 03-29-2009, 03:49 PM   #40
Hadrien
Quote:
Originally Posted by cerement View Post
Since we've managed to drag you into this topic, what format does Feedbooks have the least trouble ingesting? Or, better still, which format loses the least when uploaded and converted to Feedbooks' internal format?
The closest one would be DTBook as we're essentially adding some semantics to XHTML (parts/chapters/sections, notes etc.). The semantic elements though are not exactly expressed in the markup, as the system is more designed around the idea of using multiple chunks rather than a single source file.
Old 03-29-2009, 04:46 PM   #41
cerement
Quote:
Originally Posted by Hadrien View Post
The semantic elements though are not exactly expressed in the markup, as the system is more designed around the idea of using multiple chunks rather than a single source file.
Heh, that's been my side challenge the last couple of weeks. I have had no problem finding chunked XHTML ePub samples, but so far I'm having no luck finding a chunked DTBook ePub sample. Even the ePub tutorials all seem to focus on XHTML with nothing more than a side note "oh, you could also use DTBook if you wanted" ...
Old 03-29-2009, 05:54 PM   #42
Hadrien
Quote:
Originally Posted by cerement View Post
Heh, that's been my side challenge the last couple of weeks. I have had no problem finding chunked XHTML ePub samples, but so far I'm having no luck finding a chunked DTBook ePub sample. Even the ePub tutorials all seem to focus on XHTML with nothing more than a side note "oh, you could also use DTBook if you wanted" ...
I'm not even sure that you could really chunk pure DTBook as you need to use the <levelx> elements (http://www.daisy.org/z3986/structure...ts.html#level1).
Old 03-29-2009, 09:47 PM   #43
cerement
Quote:
Originally Posted by Hadrien View Post
I'm not even sure that you could really chunk pure DTBook as you need to use the <levelx> elements (http://www.daisy.org/z3986/structure...ts.html#level1).
From what I've been able to pull out of the DAISY 3 Structure Guidelines, you can have multipart books using DTBook, but each part still has to have the necessary headers, and the "book" has to be segmented logically (you can't just split a file into arbitrary chunks).

Ex.
Code:
<dtbook>
    <head>
    </head>
    <book>
        <bodymatter>
            <level1 class="chapter">
                <h1>Chapter 1</h1>
                <p>(paragraph tags or any block level element)</p>
            </level1>
        </bodymatter>
    </book>
</dtbook>
Code:
<dtbook>
    <head>
    </head>
    <book>
        <bodymatter>
            <level1 class="chapter">
                <h1>Chapter 2</h1>
                <p>(paragraph tags or any block level element)</p>
            </level1>
        </bodymatter>
    </book>
</dtbook>
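
Going the other way, from one large file to parts shaped like the ones above, could be sketched as follows in Python. It assumes a namespace-free DTBook source, splits only at level1 boundaries, and copies the head into every part; `split_by_level1` is an invented name and not part of any DAISY tool.

```python
import copy
import xml.etree.ElementTree as ET
from typing import List


def split_by_level1(dtbook_xml: str) -> List[str]:
    """Return one standalone DTBook document per <level1> in bodymatter."""
    root = ET.fromstring(dtbook_xml)
    head = root.find("head")
    parts = []
    for level in root.findall("./book/bodymatter/level1"):
        doc = ET.Element("dtbook", root.attrib)
        if head is not None:
            # Every part repeats the header so it can stand on its own.
            doc.append(copy.deepcopy(head))
        book = ET.SubElement(doc, "book")
        matter = ET.SubElement(book, "bodymatter")
        matter.append(copy.deepcopy(level))
        parts.append(ET.tostring(doc, encoding="unicode"))
    return parts


source = (
    '<dtbook><head><meta name="dc:Title" content="Sample"/></head>'
    "<book><bodymatter>"
    "<level1><h1>Chapter 1</h1><p>One.</p></level1>"
    "<level1><h1>Chapter 2</h1><p>Two.</p></level1>"
    "</bodymatter></book></dtbook>"
)

for part in split_by_level1(source):
    print(part)
```

Splitting at level1 keeps each chunk a logical unit, which matches the guideline that a book cannot simply be cut at arbitrary points.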
Old 04-01-2009, 12:00 PM   #44
Kirtai
From the sound of things, the best approach for now might be to simply pick a suitable format (e.g. DTBook for linear text, DITA for topic-oriented material, or DITA/TEI-lite for everything) and learn enough XSLT to convert it into whatever "god format" is eventually settled on.
Does that about cover it?