Just tried this on a couple of auto-generated mobis made via the new version of Kindle Previewer (1.5).
It now has "ePub support", by which it means that it automatically converts any ePub dragged onto it to mobi and drops the file in the same folder, apparently on the lower -c1 compression setting. There's also a new simulation option for iPad, but no K3 mode yet. Still, the people trying to figure out Kindle Audio/Video now have a new testing tool for their efforts.
Anyway, the stripping works a treat and the extraction gives back almost exactly what went in, as far as I can tell. I did a few more tests with my lazily assembled Fictionwise cleanup conversions: html comes back as zipped html, and a zipped-up ePub in yields the exact same zipped-up ePub out.
Interestingly enough, if you originally pointed KindleGen at an opf (either custom or via unpacked epub), then no matter what the source structure, the unzipped-from-stripped version yields up the css, html, image, and misc (ncx, etc.) files rearranged into separate subdirectories with exactly those names.
The stripped file shows immense space savings, often close to half, and sometimes more if there are a fair number of graphics in the source. Even pure text with no pictures comes out over a third smaller.
I have absolutely no idea why Amazon would remove the entirely logical -donotaddsource option unless they actually want to serve up plenty of bloated files via 3G and cut down on the marketable "Kindle can hold #### books!" space (and deduct extra from royalties paid out, of course), which seems rather counter-productive to me.
While we're on the subject of inexplicable KindleGen design decisions, might as well mention some more things I found out while using it:
- Plain old descendant selectors, a staple since CSS1, seem to be completely ignored. That's another black mark for KindleGen's (lack of) CSS support, and it means one will likely have to class every item one wants to target with a particular style not shared with its siblings, rather than classing a container parent element for the lot and letting specific descent, rather than generic inheritance, take place.
- If you forget to close a <div> with styling applied, all subsequent text seems to be rendered with the same styling, even if it occurs in separate files in the source, at least until it hits the next tag with a different style.
- If you have any superfluous tags in your NCX, even a mistakenly applied empty closing tag like, say, </head>, then KindleGen will merrily ignore your painstakingly constructed <navMap> and happily build with nary a warning, until you find out that your mobi has no chapter marks and spend far too long trying to figure out why.
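For what it's worth, the NCX problem in that last point is the kind of thing a quick pre-flight check can catch before KindleGen ever sees the file. Here's a rough sketch in Python using only the standard library. The function name check_ncx is just something I made up, it assumes the NCX uses the usual DAISY namespace, and it only tests well-formedness and the presence of navPoint entries, so it says nothing about what KindleGen itself will accept.

```python
import xml.etree.ElementTree as ET

# Assumption: the NCX declares the standard DAISY namespace on its root element.
NCX_NS = "{http://www.daisy.org/z3986/2005/ncx/}"


def check_ncx(ncx_text):
    """Return (ok, message) for an NCX document passed in as a string.

    Hypothetical helper, not part of KindleGen or any Amazon tool.
    """
    try:
        # A stray closing tag such as </head> makes the XML ill-formed,
        # so the parser raises ParseError here.
        root = ET.fromstring(ncx_text)
    except ET.ParseError as err:
        return False, f"not well-formed XML: {err}"

    nav_map = root.find(f"{NCX_NS}navMap")
    if nav_map is None:
        return False, "no <navMap> found"

    # Count navPoint elements at any depth below navMap.
    points = nav_map.findall(f".//{NCX_NS}navPoint")
    if not points:
        return False, "<navMap> has no <navPoint> entries"

    return True, f"{len(points)} navPoint(s) found"
```

Something like `ok, msg = check_ncx(open("toc.ncx", encoding="utf-8").read())` (toc.ncx being whatever your NCX is called) run before the build should at least flag the stray-tag case instead of leaving you with a mobi that silently has no chapter marks.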
Thanks again for writing this script! I'm sure people will be finding it very useful if Amazon's going to insist on always including the source files.