Old 01-12-2012, 12:48 PM   #242
KevinH
Sigil Developer
Hi Liz,

It does unpack and generate things so that the end user could edit the files and drop them back on kindlegen to recreate a modified mobi.

The new kindlegen creates mobis (palm database files) that actually have two completely different versions of the ebook inside them (and I am not referring to the kindlegensrc.zip, which may also be stored there).

The first is the original mobi format ebook, and immediately after it is the new K8 mobi ebook, all stored in the same .mobi palm database file.

So older technology can read the .mobi file from the top and see it as a normal mobi. Newer technology can detect that this is a compound mobi file and open the second half, which is the K8-formatted ebook (html5, basically a variation of an epub), to get all of the new features.
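To make the "two ebooks in one palm database" layout concrete, here is a minimal sketch (not the actual mobi_unpack.py code) that splits a palm database image into its records and looks for the point where the second ebook starts. The header offsets used are the standard Palm database ones (type/creator at byte 60, record count at byte 76, 8-byte record entries from byte 78). The idea that the two halves are separated by a record containing just the text BOUNDARY is an assumption on my part and is not stated above; the function names are made up for illustration.

```python
import struct

def split_records(data):
    """Split an in-memory Palm database image into (type_creator, records)."""
    ident = data[60:68]                      # b"BOOKMOBI" for mobi files
    (count,) = struct.unpack_from(">H", data, 76)
    offsets = []
    for i in range(count):
        # Each record entry: 4-byte offset, 1-byte attributes, 3-byte id.
        (off,) = struct.unpack_from(">L", data, 78 + 8 * i)
        offsets.append(off)
    # A record runs from its offset to the next offset (or end of file).
    ends = offsets[1:] + [len(data)]
    return ident, [data[s:e] for s, e in zip(offsets, ends)]

def find_boundary(records):
    """Return the index of the assumed BOUNDARY marker record, or None."""
    for i, rec in enumerate(records):
        if rec == b"BOUNDARY":               # assumption, see lead-in
            return i
    return None
```

If find_boundary returns an index, the records before it would belong to the old mobi and the records after it to the K8 version; if it returns None, the file is probably a plain single-format mobi.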

Right now, mobi_unpack.py will create in the output folder the following:

1. from the old part of the .mobi it will create the source mobi markup (old html) and images that will allow the user to edit it any way they want and drop it back on kindlegen.

2. if the kindlegensrc.zip record is present it will unpack it so that the user can see the actual source ebook file (typically an epub) given to kindlegen. This record is typically removed by Amazon but is actually created by Kindlegen.

3. from the K8 version of the .mobi, it will create the K8 folder and inside it all of the images and fonts, and xhtml source files that were used to create it. A user who did not have access to the kindlegensrc.zip could edit this and then drop it on kindlegen to create a new/altered version of the ebook (fix typos, etc).

4. From the K8 pieces, it will actually build a complete epub, which it stores as well.
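For item 4, packaging a folder of xhtml, images and fonts as an epub is mostly a matter of zipping it the right way. The sketch below is not taken from mobi_unpack.py; it only shows the general epub container rule that the mimetype entry must come first and be stored uncompressed. The function name and the paths are hypothetical, and it assumes the folder already holds META-INF/container.xml and the opf.

```python
import os
import zipfile

def pack_epub(src_dir, epub_path):
    """Zip src_dir into epub_path with the mimetype entry first and stored."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The epub container format wants this entry first and uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                arc = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arc == "mimetype":
                    continue                 # already written above
                zf.write(full, arc, compress_type=zipfile.ZIP_DEFLATED)
```

Write the epub somewhere outside src_dir, otherwise the walk may pick up the half-written output file.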

You can then compare the epub created from the K8 against the kindlegensrc.zip (typically an epub) to see what, if anything, the kindlegen processing changed.
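If you want a quick first pass at that comparison, something like the following would do. It is only a sketch: the function name is made up, and it compares the two archives by entry name and CRC rather than doing a real text diff.

```python
import zipfile

def zip_diff(path_a, path_b):
    """Compare two zip/epub archives by entry name and CRC-32."""
    def index(path):
        with zipfile.ZipFile(path) as zf:
            return {i.filename: i.CRC for i in zf.infolist() if not i.is_dir()}
    a, b = index(path_a), index(path_b)
    return {
        "only_a": sorted(set(a) - set(b)),
        "only_b": sorted(set(b) - set(a)),
        "changed": sorted(n for n in set(a) & set(b) if a[n] != b[n]),
    }
```

Since the files in the K8 folder are renamed, expect most entries to show up under only_a and only_b; for those you would have to match the files up by content and diff them by hand.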

All of this requires rebuilding and generation. The actual binary format inside of the mobi file needs to be decoded to make something that is usable in some way. If you want to see what the actual raw files look like, you can use Notepad++ or any good text editor to change one line near the top of mobi_unpack.py so that it writes out all of the raw text pieces as well.

So it is not simply something that dumps sections from the palm database file. It actually does that (the raw file) and then rebuilds it to try to get back to the original source, so that authors and other people can more easily edit their books and recreate mobi output using Kindlegen.

It is also useful for understanding the internal format of the new K8 mobis and what tags, if any, are created and used.

If you have any other questions just ask.

Take care,

Kevin




Quote:
Originally Posted by lizcastro
Thanks, Kevin! This is so helpful.

Can you confirm that the only thing mobi_unpack does is show what was in the mobi file? It doesn't generate anything, right?

When I convert an EPUB file to mobi with KindleGen2, and then unpack it with your latest version of mobi_unpack, I get a folder that contains a smaller version of the EPUB file than the original, an HTML file with what looks like the contents of the entire book, along with an ncx and opf file, and a folder with reduced size images.

Then, there's a K8 folder that contains a completely re-engineered set of files, all renamed, resized images, etc. of what was originally in my EPUB file.

And then there's a kindlegensrc.zip file, that when unzipped, contains my original unaltered files.

It all seems so excessive.

thanks,
Liz