MobileRead Forums - View Single Post - The Mobipocket format: Starring Leonardo diCaprio and Kate Winslet

schmidt349 · 11-25-2007, 01:51 AM

The problem is not that it's hard to write a Mobipocket parser for any given platform. Using XUL to render the content and grep/XSLT to transform and display the old Mobipocket content seamlessly would take me all of a week. I've already written a Perl program that "unzips" Mobipocket files into a directory structure and rewrites all the nonstandard links in the document so you can just view it in a Web browser, provided of course you can find one that speaks HTML 2.0 (hint: Safari doesn't, at least not properly).

The problem is that everyone seems to think it's all right to keep authoring content in a format with enormous limitations, all of which stem from its reliance on a firmware data format that was never intended to be used the way it is now.

Seriously, think about it. Would you support an ebook format that wrapped data in an Apple II disk image or a Super Nintendo ROM? If you use .mobi you're doing something exactly parallel.

What's worse, the compression format they use is some ancient undocumented Palm thing; the only reason my program can read it at all is the work of a kind soul on CPAN who wrote a Perl script that can decode and parse it. Without that it would probably have taken me weeks to write a specification and implement it properly. I'm a busy man; I don't have 40-hour weeks to spend staring at Mobipocket's idea of a joke.
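For anyone curious what that decoding involves, here is a minimal Python sketch of the byte-oriented LZ77-style scheme usually described in community write-ups as "PalmDoc compression". This is not the CPAN code and not an official spec. The function name is mine, and I'm assuming the record really uses this scheme (some Mobipocket files reportedly use a different, Huffman-based one, which this does not handle).

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decode one PalmDoc-compressed text record (community-documented scheme)."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        i += 1
        if b == 0x00 or 0x09 <= b <= 0x7F:
            # Plain literal byte.
            out.append(b)
        elif 0x01 <= b <= 0x08:
            # The next b bytes are copied through unchanged.
            out += data[i:i + b]
            i += b
        elif b >= 0xC0:
            # A space followed by the byte with its high bit cleared.
            out.append(0x20)
            out.append(b ^ 0x80)
        else:
            # 0x80..0xBF: two-byte back-reference.
            # 14 useful bits: 11 bits of distance, 3 bits of (length - 3).
            if i >= n:
                raise ValueError("truncated back-reference")
            pair = ((b << 8) | data[i]) & 0x3FFF
            i += 1
            distance = pair >> 3
            length = (pair & 0x07) + 3
            if distance == 0 or distance > len(out):
                raise ValueError("back-reference points outside the output")
            # Copy byte by byte, because the source may overlap the destination.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)
```

The byte-by-byte copy in the last branch matters: a back-reference is allowed to reach into bytes it is itself producing, so a slice copy would give the wrong answer for short distances.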

The .epub format is trivial to write an interpreter for. It just has XHTML documents for the text itself, support files (CSS, JS, images in JPEG and GIF), and an easy-to-read XML manifest for the whole thing, all of it wrapped in a ZIP container. All totally industry-standard and the very same stuff we've been running the Web on since 2000.
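To back up the "trivial" claim, here is a rough Python sketch that opens an .epub with nothing but the standard library and lists what the manifest declares. It assumes the usual OCF layout, where META-INF/container.xml points at the package (OPF) file; the function name `list_epub_manifest` is just something I made up for illustration, and there is no error handling for malformed books.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the container file and the package (OPF) file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def list_epub_manifest(source):
    """Return (path inside the ZIP, media type) for every manifest item.

    `source` is a filename or a file-like object holding the .epub.
    """
    with zipfile.ZipFile(source) as zf:
        # Step 1: the container file says where the package file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")

        # Step 2: the package file's manifest lists every resource.
        package = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)  # hrefs are relative to the OPF

        items = []
        for item in package.findall("opf:manifest/opf:item", OPF_NS):
            href = posixpath.join(base, item.get("href"))
            items.append((href, item.get("media-type")))
        return items
```

Calling `list_epub_manifest("book.epub")` (the filename is hypothetical) gives you the XHTML, CSS, and image entries by path; from there, rendering is a matter of handing the XHTML to whatever engine you already have.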

So why hasn't Amazon pledged to support it in the Kindle?

What I really wanted to do was convert public-domain XML/SGML versions of ancient Greek and Latin texts into a format that the Kindle could understand so I don't have to carry irreplaceable books around with me. The former's been nixed by the Kindle's complete lack of UTF-8 support (precipitated in part, I shouldn't wonder, by the limitations of the Palm database format) and the latter just doesn't seem worth the effort considering that the conversion would be a dead end.
