A few things:
* MobiPocket is an old format, derived from HTML2 with some extensions. In HTML2 times, there was no !DOCTYPE, and in any case MobiPocket has no need to differentiate between document languages (because there is only one), so you shouldn't expect it to be there. In fact, quite a bit of what mobigen/kindlegen does is convert HTML4 and XHTML to HTML2 by rewriting tags and flattening CSS into old-style tags.
* <guide> is one of the extensions. Basically they took an entire chunk of the .opf file and stuck it in the <head> tag so that devices could generate menus to navigate to parts of the document. There are historical reasons for doing it this way, originating with MobiPocket's predecessor formats, which were basically just one big HTML document wrapped in a Palm database file. There are many other ways this could have been done, but creating multiple files/streams within the Palm database would get awkward for several reasons, not least because links are all flattened to absolute file positions.
* mobigen/kindlegen specifically removes line breaks to make the file smaller, so you shouldn't expect to see any.
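To make the <guide> and "absolute file positions" points concrete, here is a rough Python sketch of pulling the guide entries out of the unpacked HTML. It assumes each <reference> carries an unquoted, zero-padded filepos attribute holding a byte offset into the same HTML stream, which is what unpacked files typically look like but is not something to rely on as a spec. The names parse_guide, REFERENCE_RE and ATTR_RE are made up for this example and do not come from mobiunpack.py.

```python
import re

# Hypothetical helpers for illustration only; not taken from mobiunpack.py.
# Attribute values in MobiPocket markup may be unquoted, so accept both forms.
REFERENCE_RE = re.compile(rb"<reference\b([^>]*)>", re.IGNORECASE)
ATTR_RE = re.compile(rb'(\w+)\s*=\s*(?:"([^"]*)"|([^\s/>]+))')


def parse_guide(raw_html: bytes):
    """Return (type, title, filepos) tuples for each <reference> in <guide>.

    Assumption: filepos is a byte offset into raw_html itself.
    """
    guide = re.search(rb"<guide>(.*?)</guide>", raw_html,
                      re.IGNORECASE | re.DOTALL)
    if guide is None:
        return []
    entries = []
    for ref in REFERENCE_RE.finditer(guide.group(1)):
        attrs = {}
        for name, quoted, bare in ATTR_RE.findall(ref.group(1)):
            value = quoted or bare
            attrs[name.lower().decode("ascii")] = value.decode("utf-8", "replace")
        if "filepos" in attrs:
            entries.append((attrs.get("type"), attrs.get("title"),
                            int(attrs["filepos"])))
    return entries


# Build a tiny fake document whose filepos really points at the body text.
# The %010d field has a fixed width, so the header length does not depend
# on the value written into it.
head_template = (b'<html><head><guide><reference title="Start" type="text" '
                 b'filepos=%010d /></guide></head><body>')
body = b"<p>Chapter 1</p></body></html>"
start = len(head_template % 0)
sample = (head_template % start) + body

entries = parse_guide(sample)
for ref_type, title, filepos in entries:
    # Slicing at filepos lands on the markup the menu entry points to.
    print(ref_type, title, sample[filepos:filepos + 16])
```

Because the offsets are absolute, anything that changes the length of the HTML ahead of a target (re-adding line breaks, for instance) invalidates every filepos after it, which is a big part of why the format is painful to edit by hand.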
Honestly, MobiPocket is such a crappy format that I would strongly advise avoiding it at all costs, with the sole exception of using it as an output format to display on a Kindle. For all other purposes, you should use ePub. I only wrote the original mobiunpack.py because I tried to decompress the dictionary with other tools, it took more than 30 minutes, and I wanted to demonstrate that it could be done much better (even in Python).
Last edited by adamselene; 10-15-2010 at 02:12 AM.