Old 11-12-2009, 01:50 PM   #1
adamselene
Enthusiast
Posts: 39
Karma: 11036
Join Date: Nov 2009
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
KindleUnpack (MobiUnpack): Extracts text, images and metadata from Kindle/Mobi files

Most of this post now by pdurrant.

KindleUnpack is a set of Python scripts that takes a Kindle/Mobipocket ebook, extracts the HTML, images and metadata it contains, and puts them in a form suitable for passing to KindleGen.

For KF8 files and combined Mobipocket/KF8 files, it can also produce separate Mobipocket and KF8 files, plus the original source files if those are included in the ebook. In addition, for KF8 files it can produce an 'ePub', although if the HTML isn't compliant with ePub standards, the 'ePub' won't be either.

For Amazon's .azw4 files, it will extract the PDF that's been wrapped up in Amazon's .azw4 file format.
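As background for readers new to the format: Mobipocket files are stored in a Palm Database container, and unpacking starts by splitting that container into its records. The following is only a minimal sketch of that first step, not code taken from KindleUnpack; the function name split_pdb_records is made up for illustration, and it assumes the standard 78-byte Palm Database header followed by 8-byte record entries.

```python
import struct

def split_pdb_records(data: bytes):
    """Split a Palm Database blob into (identifier, list of raw records).

    Hypothetical helper for illustration only; it is not part of KindleUnpack.
    """
    if len(data) < 78:
        raise ValueError("too short to be a Palm Database file")
    # Bytes 60..67 hold the type and creator codes, b"BOOKMOBI" for Mobipocket.
    ident = data[60:68]
    # Bytes 76..77 hold the big-endian record count.
    (count,) = struct.unpack_from(">H", data, 76)
    # Each 8-byte record entry starts with a 4-byte big-endian file offset.
    offsets = [struct.unpack_from(">L", data, 78 + 8 * i)[0] for i in range(count)]
    offsets.append(len(data))  # the last record runs to the end of the file
    records = [data[offsets[i]:offsets[i + 1]] for i in range(count)]
    return ident, records
```

In a real file, record 0 holds the headers, and later records hold the compressed text and the images.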

Downloads available:
Version 0.75 of the Python scripts (including a .pyw graphical front end)
Version 0.75 of a drag-and-drop AppleScript version.
A calibre plugin version of the scripts is available in this thread.

For anyone not interested in KindleGen and KF8, there's a copy of the last version of the single-file script, mobiunpack 0.32.

The name of the script was changed to KindleUnpack with version 0.6.1.

The Python scripts are released under GPLv3. The AppleScript wrapper is released under the Unlicense.

Many thanks to adamselene for the base code which has been built on by many of the participants of this thread.

pdurrant



[Original Post:]
I reimplemented huff/cdic compression in Python, and did a few other things while I was at it. The new script:

* decompresses about 25x faster than mobihuff.py
* uses much less memory (about 16x on my largest test file)
* implements conversion of uncompressed and Palmdoc-compressed files
* handles trailing data correctly in all cases

Check it out: http://www.mit.edu/afs/athena/user/m.../mobiunpack.py
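For readers curious about the Palmdoc compression mentioned in the list above, here is a minimal sketch of a PalmDoc decompressor. It is not the code from mobiunpack.py; the function name is illustrative, and it assumes the commonly documented PalmDoc byte scheme (literals, short literal runs, space-plus-character codes, and two-byte back-references).

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decompress one PalmDoc-compressed record (illustrative sketch)."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        c = data[i]
        i += 1
        if c == 0x00 or 0x09 <= c <= 0x7F:
            out.append(c)                    # plain literal byte
        elif c <= 0x08:
            out += data[i:i + c]             # the next c bytes are literals
            i += c
        elif c >= 0xC0:
            out.append(0x20)                 # a space ...
            out.append(c ^ 0x80)             # ... followed by one character
        else:
            # 0x80..0xBF: two-byte back-reference with an
            # 11-bit distance and a 3-bit length (stored as length - 3).
            if i >= n:
                raise ValueError("truncated back-reference")
            pair = ((c << 8) | data[i]) & 0x3FFF
            i += 1
            distance = pair >> 3
            length = (pair & 0x07) + 3
            if distance == 0 or distance > len(out):
                raise ValueError("invalid back-reference distance")
            for _ in range(length):
                out.append(out[-distance])   # byte-wise copy allows overlap
    return bytes(out)
```

The byte-by-byte copy matters because a back-reference may overlap the bytes it is producing.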

PLEASE NOTE that this tool is only for decompressing unencrypted Mobipocket files. It does not decrypt DRMed files. Do not ask me for help breaking DRM.
Attached Files
File Type: zip mobiunpack 32.py.zip (18.4 KB, 6047 views)
File Type: zip KindleUnpack v0.75.app.zip (431.7 KB, 32 views)
File Type: zip KindleUnpack_v075.zip (93.9 KB, 41 views)

Last edited by pdurrant; 09-14-2014 at 04:02 AM. Reason: Updated to v0.75
Old 11-13-2009, 11:22 PM   #2
adamselene
The latest version (0.07, same location) is even faster: now about 50x as fast as mobihuff.py.

Last edited by adamselene; 11-14-2009 at 12:02 AM.
Old 11-14-2009, 07:20 AM   #3
quocsan
Member
Posts: 19
Karma: 10
Join Date: Jul 2009
Device: none
Great job!
Thank you, Adamselene.
Old 11-15-2009, 09:04 PM   #4
HansTWN
Wizard
Posts: 4,540
Karma: 264065402
Join Date: Jun 2009
Location: Taiwan
Device: HP Touchpad, Sony Duo 13, Lumia 920, Kobo Aura HD
Time to get working on those Topaz files! Wink, wink!
Old 02-05-2010, 12:37 PM   #5
pdurrant
The Grand Mouse
Posts: 31,678
Karma: 87823216
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
Quote:
Originally Posted by adamselene
PLEASE NOTE that this tool is only for decompressing unencrypted Mobipocket files. It does not decrypt DRMed files. Do not ask me for help breaking DRM.
Many thanks for this. I have moved the latest versions into the first post in this thread now. (Being a moderator has some advantages.)

Last edited by pdurrant; 07-26-2012 at 04:18 AM. Reason: Moved files.
Old 02-05-2010, 01:33 PM   #6
soalla
Member
Posts: 18
Karma: 10
Join Date: Apr 2008
Device: iPod Touch, Sony PRS-505
Thanks to both of you!
Old 02-05-2010, 06:05 PM   #7
pdurrant
I've now tweaked the script to also output the images.

Note that the HTML file is the raw contents of the Mobipocket file, and so the img attributes in it aren't proper HTML, and don't point to the extracted images. To get working images in the HTML, a bit of search/replace will be needed, although it should be possible to do it with a single grep, as I've tried to make the file names easy to use with what's in the HTML file.
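The search/replace described above could be sketched in Python roughly as follows. This is an illustration, not part of the script: it assumes the raw Mobipocket markup refers to images through a numeric recindex attribute, and the output naming scheme images/imageNNNNN.jpg is a hypothetical placeholder that would need to match the file names the script actually writes.

```python
import re

# Matches recindex="00003", recindex='00003' or recindex=00003.
RECINDEX = re.compile(r'\brecindex=["\']?(\d+)["\']?', re.IGNORECASE)

def link_images(html: str, name_pattern: str = "images/image{:05d}.jpg") -> str:
    """Replace recindex attributes with src attributes (illustrative sketch).

    name_pattern is an assumption; adjust it to the extracted file names.
    """
    def to_src(match):
        number = int(match.group(1))
        return 'src="{}"'.format(name_pattern.format(number))
    return RECINDEX.sub(to_src, html)
```

Note that the pattern does not check that the attribute sits inside an img tag, which should be adequate for a one-off conversion but is worth keeping in mind.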
Old 02-06-2010, 04:04 AM   #8
Jellby
frumious Bandersnatch
Posts: 6,142
Karma: 4792399
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Pssst, remove the __MACOSX directory
Old 02-06-2010, 10:54 AM   #9
pdurrant
Quote:
Originally Posted by Jellby
Pssst, remove the __MACOSX directory
OK, done. Saved 48 bytes!
Old 02-09-2010, 03:03 PM   #10
pdurrant
Tweaked again, mostly by some_updates from the Dark Reverser's blog comments, to output some of the metadata from the file.

I've added to his work by getting the metadata output as an opf file resembling the original file used to generate the Mobipocket file.
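To give a rough idea of what writing metadata out as an opf file involves, here is a minimal sketch using Python's standard library. It is not the script's actual output: the function name build_opf, the choice of fields, and the bare package/metadata/manifest/spine layout are illustrative assumptions, and a real opf file for Mobipocket Creator or KindleGen will carry more detail.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core namespace

def build_opf(title: str, author: str, language: str, html_name: str) -> str:
    """Return a minimal OPF-style package document (illustrative sketch)."""
    ET.register_namespace("dc", DC_NS)
    package = ET.Element("package")
    metadata = ET.SubElement(package, "metadata")
    for tag, value in (("title", title), ("creator", author), ("language", language)):
        ET.SubElement(metadata, "{%s}%s" % (DC_NS, tag)).text = value
    # The manifest lists the files; the spine gives the reading order.
    manifest = ET.SubElement(package, "manifest")
    ET.SubElement(manifest, "item",
                  {"id": "text", "href": html_name, "media-type": "text/html"})
    spine = ET.SubElement(package, "spine")
    ET.SubElement(spine, "itemref", {"idref": "text"})
    return ET.tostring(package, encoding="unicode")
```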

However, the raw output of the 'html' in the Mobipocket file still needs a fair bit of work before it'll be possible to regenerate the file using Mobipocket Creator or KindleGen.

That's my eventual aim with this, however.
Old 02-18-2010, 10:46 AM   #11
pdurrant
Quote:
Originally Posted by pdurrant
However, the raw output of the 'html' in the Mobipocket file still needs a fair bit of work before it'll be possible to regenerate the file using Mobipocket Creator or KindleGen.

That's my eventual aim with this, however.
By taking code from other sources and tweaking it, Version 0.17 (above) now creates an opf file, a folder of images, and an html file that are ready for use with Mobipocket Creator.

Simply opening the opf file in Mobipocket Creator and choosing to build re-creates the original Mobipocket book. Of course, it also means that it's easy to correct any typos in the HTML file first.
Old 02-19-2010, 05:33 AM   #12
quocsan
Quote:
Originally Posted by pdurrant
By taking code from other sources and tweaking it, Version 0.17 (above) now creates an opf file, a folder of images, and an html file that are ready for use with Mobipocket Creator.

Simply opening the opf file in Mobipocket Creator and choosing to build re-creates the original Mobipocket book. Of course, it also means that it's easy to correct any typos in the HTML file first.
Thank you, pdurrant!
You have done a great job!
BTW, could you please make a small Python script that can change eBook metadata (e.g., the eBook's title)?
Sometimes we need to change the titles to group eBooks.
Thank you in advance for your attention.
Old 02-19-2010, 12:30 PM   #13
mbovenka
Wizard
Posts: 1,179
Karma: 1619681
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Aura HD
Quote:
Originally Posted by quocsan
Thank you, pdurrant!
You have done a great job!
BTW, could you please make a small Python script that can change eBook metadata (e.g., the eBook's title)?
Sometimes we need to change the titles to group eBooks.
Thank you in advance for your attention.
If you have Calibre installed, 'ebook-meta' will do what you want. If you haven't, you should :-).
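For anyone who wants to script this, a small Python wrapper around Calibre's ebook-meta command line tool might look like the sketch below. It assumes ebook-meta is installed and on the PATH; the helper names are made up for illustration, and only the tool's --title option is used.

```python
import subprocess

def ebook_meta_command(book_path: str, title: str) -> list:
    """Build the ebook-meta argument list for setting a title (hypothetical helper)."""
    return ["ebook-meta", book_path, "--title", title]

def set_title(book_path: str, title: str) -> None:
    """Run ebook-meta; raises CalledProcessError if the tool reports failure."""
    subprocess.run(ebook_meta_command(book_path, title), check=True)
```

Passing the arguments as a list rather than a shell string means a title with spaces or non-ASCII characters needs no extra quoting.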
Old 02-19-2010, 05:57 PM   #14
pdurrant
Quote:
Originally Posted by quocsan
Thank you, pdurrant!
You have done a great job!
BTW, could you please make a small Python script that can change eBook metadata (e.g., the eBook's title)?
Sometimes we need to change the titles to group eBooks.
Thank you in advance for your attention.
I use MobiPerl for that (with a little AppleScript wrapper).
Old 02-19-2010, 10:27 PM   #15
quocsan
Quote:
Originally Posted by pdurrant
I use MobiPerl for that (with a little AppleScript wrapper).
I see. But I meant a title in Unicode (eBook => 'Sách Điện Tử' in Vietnamese).
MobiPerl cannot deal with Unicode titles.
I have changed eBooks' titles with WinHex, but I dislike doing that by hand.
OK, I'll try with ... Google.
Thank you for your attention.
All times are GMT -4.