|  01-12-2012, 12:45 PM | #241 | |
| The Grand Mouse 高貴的老鼠            Posts: 74,408 Karma: 318076944 Join Date: Jul 2007 Location: Norfolk, England Device: Kindle Oasis | Quote: 
 Yes, the output from the new KindleGen does contain the Mobipocket, KF8 and your source files, all wrapped up in one. | |
|   |   | 
|  01-12-2012, 12:48 PM | #242 | |
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
Hi Liz,

It does unpack and generate things so that the end user can edit the files and drop them back on kindlegen to recreate a modified mobi.

The new kindlegen creates mobis (Palm database files) that actually have two completely different versions of the ebook inside them (and I am not referring to the kindlegensrc.zip, which may also be stored there). The first is the original mobi format ebook, and immediately after it is the new K8 mobi ebook, all stored in the same .mobi Palm database file. So older technology can read the .mobi file from the top and see it as a normal mobi. Newer technology can detect that this is a compound mobi file and open the second half, which is the K8-formatted version (html5, basically a variation of an epub), to get all of the new features.

Right now, mobi_unpack.py will create the following in the output folder:

1. From the old part of the .mobi, it will create the source mobi markup (old html) and images that will allow the user to edit it any way they want and drop it back on kindlegen.
2. If the kindlegensrc.zip record is present, it will unpack it so that the user can see the actual source ebook file (typically an epub) given to kindlegen. This record is typically removed by Amazon but is actually created by Kindlegen.
3. From the K8 version of the .mobi, it will create the K8 folder and inside it all of the images, fonts, and xhtml source files that were used to create it. A user who did not have access to the kindlegensrc.zip could edit this and then drop it on kindlegen to create a new/altered version of the ebook (fix typos, etc).
4. From the K8 pieces, it will build a complete epub, which is stored as well. You can then compare the epub created from the K8 against the kindlegensrc.zip (typically an epub) to see what, if anything, the kindlegen processing changed.

All of this requires rebuilding and generation. The actual binary format inside the mobi file needs to be decoded to make something that is usable in some way. If you want to see what the actual raw files look like, you can use Notepad++ or any good text editor to change one line near the top of mobi_unpack.py so that it writes out all of the raw text pieces as well.

So it is not simply something that dumps sections from the Palm database file. It does that (the raw file) and then rebuilds it to try to get back to the original source, so that authors and other people can more easily edit their books and recreate mobi output using Kindlegen. It is also useful for understanding the internal format of the new K8 mobis and what tags, if any, are created and used.

If you have any other questions, just ask.

Take care,
Kevin
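To make the "two ebooks in one Palm database" layout above more concrete, here is a minimal sketch of splitting a .mobi into its raw sections and looking for the point where the K8 half begins. It is not taken from mobi_unpack.py. The function names are hypothetical, and the assumption that the two halves are separated by a section whose payload begins with the bytes BOUNDARY is an inference from the "BOUNDARY section" mentioned later in this thread, not something the post states.

```python
import struct
from typing import List, Optional


def palmdb_sections(data: bytes) -> List[bytes]:
    """Split a Palm database image into its raw sections (records)."""
    # The record count is a big-endian 16-bit value at offset 76 of the header.
    (count,) = struct.unpack_from(">H", data, 76)
    # Each 8-byte record-info entry starts with a 32-bit offset to the record data.
    offsets = [struct.unpack_from(">L", data, 78 + 8 * i)[0] for i in range(count)]
    offsets.append(len(data))  # the last record runs to the end of the file
    return [data[offsets[i]:offsets[i + 1]] for i in range(count)]


def find_kf8_start(sections: List[bytes]) -> Optional[int]:
    """Return the index of the first section after the boundary marker, if any.

    Assumption: a compound file contains a section starting with b"BOUNDARY"
    and the K8 half begins in the section right after it.
    """
    for index, payload in enumerate(sections):
        if payload.startswith(b"BOUNDARY"):
            return index + 1
    return None
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `palmdb_sections` gives the list of raw sections; `find_kf8_start` returns `None` for a file that has no such marker.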
 | |
|   |   | 
|  01-12-2012, 01:10 PM | #243 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			Fascinating! Thanks so much for the info. And for mobi_unpack itself. I find the fact that the mobi file contains a non-KF8 version, a KF8 version AND the original EPUB particularly interesting. And I hate the way all the files get renamed! I assume that's KindleGen and not mobi_unpack. Are either of you on Twitter? I'd love to follow you. best, Liz | 
|   |   | 
|  01-12-2012, 02:08 PM | #244 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
Hi Liz,

Inside the .mobi there are no file names at all. Each font, image, etc. is just stored in a section of the database (with no name info) and referred to from the processed html (i.e. all links are converted to section numbers in the .mobi Palm database). So all "names" are created by us, either based on the title or simply numbered: img0001.jpg, font0002.ttf, part0004.xhtml, etc. We have no way of knowing what the original name was, whether it was a chapter, or section or ....

That is the main reason we need to re-generate things. Even in the older mobis, the mobi markup html that was input to kindlegen was processed to remove links, store images in sections, etc., and so we must reverse that to get back to something that can be edited by users.

As for twitter - I am too old to deal with anything new ;-) But I am sure Paul, or DiapDealer, or any of the other contributors from this forum topic (mobi_unpacker is really the joint effort of a lot of people) would be happy to answer any questions.

Take care,
KevinH | 
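As an illustration of why the unpacked files end up with names like img0001.jpg, here is a small sketch that invents a name for an unnamed section from its number and its leading bytes. The helper `synthetic_name` and the table of signatures are hypothetical. The post does not say how mobi_unpack chooses the number or the extension, and the sketch assumes the payload is already in its final decoded form.

```python
from typing import Optional

# Leading "magic" bytes for a few common resource types (not exhaustive).
_MAGIC = [
    (b"\xff\xd8\xff", "img", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "img", "png"),
    (b"GIF8", "img", "gif"),
    (b"\x00\x01\x00\x00", "font", "ttf"),
    (b"OTTO", "font", "otf"),
]


def synthetic_name(section_number: int, payload: bytes) -> Optional[str]:
    """Invent a file name for an unnamed section, or None if it is unrecognised."""
    for magic, prefix, ext in _MAGIC:
        if payload.startswith(magic):
            # Zero-padded numbering gives names in the img0001.jpg style.
            return f"{prefix}{section_number:04d}.{ext}"
    return None
```

Because the number is the only identity a section has, any link in the rebuilt xhtml has to be rewritten to point at the invented name.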
|   |   | 
|  01-12-2012, 02:18 PM | #245 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			Whoa. I didn't realize. I sort of knew that mobi was this big mass of data, but didn't realize to what extent. So, if I understand correctly, mobi_unpack reverse engineers the mobi and then generates what the individual files would look like if they were individual files?  So it's not KindleGen that renames them, it's mobi_unpack, but it does so because it has no other choice, since the names are lost in the conversion to mobi? But the kindlegensrc.zip file actually comes from a real, existing EPUB that's sitting there in the mobi file created by KindleGen? Going to set WRITE_RAW_DATA to True now to see what happens. thanks! Liz | 
|   |   | 
|  01-12-2012, 02:31 PM | #246 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			Why does mobi_unpack generate an EPUB file?
		 | 
|   |   | 
|  01-12-2012, 02:45 PM | #247 | |
| Grand Sorcerer            Posts: 28,862 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Quote: 
 So since the original source won't be part of a commercially available, DRM-Free KF8 ebook, mobi_unpack decompiles the KF8 data into a familiar, standard, editable format that can be easily modified (or examined) with existing tools/programs and then fed right back to kindlegen. Last edited by DiapDealer; 01-12-2012 at 03:04 PM. | |
|   |   | 
|  01-12-2012, 03:07 PM | #248 | |
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
Hi,

Yes, exactly as DiapDealer said! It is nice to have the kindlegensrc.zip, but ebooks downloaded from Amazon won't have that. Amazon strips it off (and if they keep it, they could start selling epubs if they ever wanted to).

So mobi_unpacker tries to recreate the original epub as closely as it can, based on the K8 information (which is xhtml-based with normal css: essentially an epub with the main bits merged into one file, with links replaced and a few other modifications).

Take a look at the _k8.raw file in a text editor to see what kindlegen actually stores inside. You can find the css info stored at the end (inline), with any svg moved there as well. You can see how they have replaced links with base 32 numbered references, added their own aid="", etc. The mobi_unpacker figures out how to reverse all of that to get back as close to an epub as possible, since that is the input format for kindlegen.

Take care,
Kevin
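For readers curious about the "base 32 numbered references", here is a hedged sketch of decoding them. The post only says the links are base 32 numbers. The specific link shape used below (kindle:pos:fid:...:off:...) and the digit alphabet (0-9 followed by A-V) are assumptions, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed link shape: kindle:pos:fid:XXXX:off:YYYYYYYYYY, both parts in base 32.
_POS_FID = re.compile(r"kindle:pos:fid:([0-9A-V]+):off:([0-9A-V]+)")


def from_base32(text: str) -> int:
    """Decode a base 32 number written with the digits 0-9 and A-V."""
    return int(text, 32)


def find_pos_fid_links(markup: str) -> List[Tuple[int, int]]:
    """Return (fragment number, offset) pairs for every assumed-format link."""
    return [
        (from_base32(fid), from_base32(off))
        for fid, off in _POS_FID.findall(markup)
    ]
```

For example, `from_base32("0010")` is 32 and `from_base32("000V")` is 31. Turning a decoded pair back into a file name and anchor needs the K8 index tables, which this sketch does not cover.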
 | |
|   |   | 
|  01-12-2012, 03:11 PM | #249 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			Hmm. I don't see the _k8.raw file. When I used WRITE_RAW_DATA=True, the only thing I got different was a .rawml file, but it looks a lot like the .html file on the non-kf8 side. Should I have modified some other setting?
		 | 
|   |   | 
|  01-12-2012, 03:17 PM | #250 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
Hi,

Look for a file inside the K8 directory that is named after the title of the book and ends with .rawml (I used to call it _k8.raw but then moved it inside the K8 directory so that it would not clash with the raw version from the older mobi part of the ebook). You should find the css at the end, links changed, aid="" placed in tags to augment the original id="", etc.

For fun you can look at the .rawml version outside of the K8 directory. It shows how the original mobi markup language got processed by kindlegen. Check out the links, how styles are inlined, etc.

Last edited by KevinH; 01-12-2012 at 03:23 PM. | 
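One of the reversals described above, removing the added aid="" attributes while leaving the original id="" alone, can be sketched with a regular expression. This is a simplification and not how mobi_unpack necessarily does it: the helper name is hypothetical, and a plain regex would also alter running text that happens to contain something shaped like an aid attribute.

```python
import re

# Match whitespace followed by aid="..." or aid='...'. Requiring the leading
# whitespace keeps attributes such as data-aid="..." untouched.
_AID_ATTR = re.compile(r'''\s+aid=(?:"[^"]*"|'[^']*')''')


def strip_aid(markup: str) -> str:
    """Remove aid attributes from K8 raw markup."""
    return _AID_ATTR.sub("", markup)
```

Calling `strip_aid('<p id="c1" aid="0A">Hi</p>')` gives `<p id="c1">Hi</p>`.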
|   |   | 
|  01-12-2012, 03:42 PM | #251 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			I see. Interesting. Here's another question. If I'm selling mobi files directly, how do I get rid of the original EPUB? It seems like it would make the file unnecessarily large. | 
|   |   | 
|  01-12-2012, 03:53 PM | #252 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
Hi, I believe Paul has a kindlegensrc stripper someplace? Try searching this Mobi forum and you should find a thread about it. Ahh ... there is a KindleStrip program (next thread down I believe) that does what you want, but it has not yet been updated to deal with the new kindlegen. I am sure someone here will soon patch it to make it work, and perhaps expand it to remove the K8 or older mobi parts as well. It makes no sense to ship so many copies of the ebook; it is just generating bloat. Last edited by KevinH; 01-12-2012 at 03:56 PM. | 
|   |   | 
|  01-12-2012, 04:05 PM | #253 | 
| Member   Posts: 16 Karma: 148 Join Date: Apr 2010 Device: iPad, NOOK, Kindle, Kobo | 
			
			Thanks!
		 | 
|   |   | 
|  01-12-2012, 05:26 PM | #254 | |
| The Grand Mouse 高貴的老鼠            Posts: 74,408 Karma: 318076944 Join Date: Jul 2007 Location: Norfolk, England Device: Kindle Oasis | Quote: 
 | |
|   |   | 
|  01-14-2012, 04:05 PM | #255 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
				
				new version of experimental K8 mobi_unpack.py
			 
			
Hi,

I have had access to more samples (including the fixed-layout Children's sample) and therefore have:

- added support for image files used in CSS sheets (needed for fixed layout)
- modified the unpacker to deal with the extra metadata fields used by fixed-layout ebooks: "RegionMagnification", "fixed-layout", "book-type", "orientation-lock", "original-resolution"
- identified the BOUNDARY section number
- made it build the epub from the K8 pieces with compression
- fixed the mobi_k8proc.py class code to be better encapsulated (added accessor methods)
- fixed support for older mobis with no ncx

So attached is the very latest version of the experimental mobi_unpack.py program.

python ./mobi_unpack.py Jerome.mobi test/

PS: I have just updated the .zip attachment with all bug fixes I know about so far, including some additional support for guide elements.

PPS: I have again now updated the .zip attachment to support multiple @import url statements in css.

PPPS: removed since DiapDealer has posted the latest version later on in this thread.

Last edited by KevinH; 01-18-2012 at 10:28 AM. Reason: removed old zip DiapDealer has posted the latest version | 
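The "builds the epub from the K8 pieces with compression" item can be illustrated with Python's standard zipfile module. This is a sketch under assumptions, not the code from the attachment: `build_epub` is a hypothetical name, and the only format rule relied on is the EPUB container requirement that the `mimetype` entry comes first and is stored uncompressed, while everything else may be deflated.

```python
import os
import zipfile


def build_epub(source_dir: str, epub_path: str) -> None:
    """Zip an unpacked book folder into an epub, compressing all but mimetype."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry must be the first one and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()  # walk directories in a stable order
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

The output path should sit outside `source_dir`, otherwise the walk would try to add the half-written epub to itself.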
|   |   | 
|  | 