MobileRead Forums - View Single Post - What format to store books in? What software to read them with?

nairbv · 01-03-2008, 01:00 AM

@recycledelectron:

So what's your opinion on epub? A file format that is sometimes a zip file containing XHTML, and sometimes a zip file containing a DTBook? And then, on top of that, maybe a third "it's preferred that you use this XML document, if you know how to parse it" option?

A file format whose rendering is handled by CSS when displayed in a web browser, but by an Adobe-specific file called page-template.xpgt when displayed by the main "epub compliant" software that currently exists.

@kovidgoyal:

Sure, it *should* go in the OPF file. But if the book was converted from HTML, the metadata will probably also be in the HTML file. If it was converted from a DTBook, it will probably also be in the DTBook. And if it was converted lazily, which will be the case often enough, it might never have been copied into the OPF file.

When converting from epub to HTML, most people will just pull out the HTML file and think "I'm done," so ideal epub authoring software would put the data in both places when it first creates the epub file. But buggy software often enough cuts a string off at some number of characters, so even something as minor as the occasional poorly written tool will mean that the two copies of the metadata don't match.
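To make the mismatch problem concrete, here is a rough Python sketch of how a tool might compare the two copies once both have been read into plain dicts. The function names and the status labels ("match", "truncated", "conflict", "missing") are made up for illustration, and the "one value is a prefix of the other" test is only a guess at what a truncation bug would look like, not something any epub tool is known to do.

```python
def compare_field(opf_value, html_value):
    """Classify how two copies of one metadata field relate to each other."""
    a = (opf_value or "").strip()
    b = (html_value or "").strip()
    if not a or not b:
        return "missing"        # one copy is absent or empty
    if a == b:
        return "match"
    shorter, longer = sorted((a, b), key=len)
    if longer.startswith(shorter):
        return "truncated"      # assumption: a cut-off string is a prefix of the full one
    return "conflict"           # both present, genuinely different


def compare_copies(opf_meta, html_meta):
    """Compare every field that appears in either copy of the metadata."""
    fields = sorted(set(opf_meta) | set(html_meta))
    return {f: compare_field(opf_meta.get(f), html_meta.get(f)) for f in fields}


if __name__ == "__main__":
    # Hypothetical values, as if already extracted from the OPF and HTML files.
    opf = {"title": "The Life and Opinions of Tristram Shandy, Gentleman",
           "creator": "Laurence Sterne"}
    html = {"title": "The Life and Opinions of Tristram",
            "creator": "Sterne, Laurence",
            "language": "en"}
    for field, status in compare_copies(opf, html).items():
        print(field, status)
```

Even this toy version has to decide what to do with a "conflict", which is exactly the kind of decision a store-it-once design would never force on anyone.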

*Good* reader software would probably check the HTML and/or DTBook metadata when it fails to find all of the metadata in the OPF file. After all, it might be there, so why miss data?
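A minimal sketch of that kind of fallback, in Python with only the standard library, might look like the following. It assumes the OPF and XHTML documents have already been pulled out of the zip as strings, that the OPF uses the usual Dublin Core namespace for its metadata, and that the XHTML carries its copy in `<title>` and `<meta name="...">` tags. The function names, the `wanted` field list, and the "author" to "creator" alias are my own assumptions, not anything the epub spec prescribes.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"   # Dublin Core namespace
XHTML = "{http://www.w3.org/1999/xhtml}"    # XHTML namespace


def opf_metadata(opf_xml):
    """Collect non-empty Dublin Core fields (title, creator, ...) from an OPF document."""
    found = {}
    for el in ET.fromstring(opf_xml).iter():
        if el.tag.startswith(DC) and el.text and el.text.strip():
            found.setdefault(el.tag[len(DC):], el.text.strip())
    return found


def xhtml_metadata(xhtml):
    """Collect metadata from <title> and <meta name=... content=...> in an XHTML page."""
    root = ET.fromstring(xhtml)
    found = {}
    title = root.find(f"{XHTML}head/{XHTML}title")
    if title is not None and title.text and title.text.strip():
        found["title"] = title.text.strip()
    for meta in root.iter(f"{XHTML}meta"):
        name = meta.get("name", "").lower()
        content = meta.get("content", "").strip()
        if not name or not content:
            continue
        if name.startswith("dc."):          # e.g. "DC.creator" -> "creator"
            name = name[3:]
        if name == "author":                # assumed alias for dc:creator
            name = "creator"
        found.setdefault(name, content)
    return found


def merged_metadata(opf_xml, xhtml, wanted=("title", "creator", "language")):
    """Prefer the OPF copy; fall back to the XHTML copy for wanted fields it lacks."""
    meta = opf_metadata(opf_xml)
    fallback = xhtml_metadata(xhtml)
    for field in wanted:
        if field not in meta and field in fallback:
            meta[field] = fallback[field]
    return meta
```

Note how much guessing the fallback path needs (which tag names count, which copy wins), compared with the three lines it takes to read the OPF when the data is stored in one place.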

I see these as unnecessary complications introduced by a poorly thought-out design. I'm just saying that I would prefer a solution that stores semantic data only once. For me, much of the point of moving to a single format is to reduce redundancy. If the format I'm converting to maintains redundancy, then there's no reason for me to bother.
