MobileRead Forums - View Single Post - KindleUnpack (MobiUnpack): Extracts text, images and metadata from Kindle/Mobi files

KevinH · 10-05-2014, 11:32 AM

Hi tkeo,

Thanks for testing this.

Quote:

Originally Posted by tkeo

1. HDimage_test.mobi (an epub3 fixed layout ebook which I posted before)

Successfully unpacked with python 2; but with python 3, got an error message:

replacement = b'%s%s%s'%(osep, b'../Images/' + imageName, csep)
TypeError: can't concat bytes to str

This was a combination of two Python 3 problems:

- Python 3 currently does not allow % to fold ASCII or UTF-8 strings into binary data (there is a PEP on this)

- iterating over bytes and extracting single bytes from bytestrings behave differently than in Python 2; there is a PEP on this as well (PEP 467), but nothing definite yet

But I have now fixed this.
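As a rough illustration of the two problems above (a sketch, not the actual KindleUnpack fix): the code below builds the replacement by plain bytes concatenation after encoding any text piece, and takes single bytes by slicing so the result is the same on Python 2 and 3. Only `osep`, `csep`, and `imageName` come from the traceback; the function names and sample values are hypothetical.

```python
def build_replacement(osep, image_name, csep):
    """Join the pieces as bytes without relying on % formatting of bytes.

    Hypothetical helper: image_name may arrive as text or as bytes.
    """
    if not isinstance(image_name, bytes):
        # Text must be encoded first; mixing bytes and text raises
        # TypeError on Python 3.
        image_name = image_name.encode('utf-8')
    return osep + b'../Images/' + image_name + csep


def single_bytes(data):
    """Return each byte of data as a length-1 bytestring.

    Slicing (data[i:i+1]) gives a bytestring on both Python 2 and 3,
    whereas data[i] gives a 1-char string on Python 2 and an int on Python 3.
    """
    return [data[i:i + 1] for i in range(len(data))]


# Sample values below are made up for illustration.
replacement = build_replacement(b'<img src="', 'image00001.jpeg', b'"/>')
pieces = single_bytes(b'MOBI')
```

Python 3.5 and later do accept % on bytes, but the operands still have to be bytes, so encoding the image name first is needed either way.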

Quote:

2. test2.awz3 (an epub2 reflowable ebook in English with several images)
Got errors with both versions.

This is because I have not yet tried books with Huffman/CDIC compression. I will generate a few test cases and see if I can track this down.

Quote:

3. kokoro.mobi (an epub3 rtl reflowable ebook in Japanese)
Unpacked as an epub2 ebook instead of an epub3 with both versions.

Probably due to a comparison against a string constant where the variable being tested is a bytestring and the constant is unicode, or vice versa.
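To show why this kind of bug goes unnoticed: on Python 3, comparing a bytestring with a text string never raises an error, it just evaluates to False, so a version check can silently take the wrong branch. A minimal sketch of one way to guard against it, using hypothetical helper names (this is not code from KindleUnpack):

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding only if it is a bytestring."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def same_text(left, right, encoding='utf-8'):
    """Compare two values after normalizing both to text."""
    return to_text(left, encoding) == to_text(right, encoding)


# On Python 3 the mixed comparison is simply False, with no exception.
mixed = (b'abc' == 'abc')
# Normalizing both sides first gives the intended result.
normalized = same_text(b'abc', 'abc')
```

Normalizing at the point where the value is read (rather than at each comparison) is usually less error-prone, but either approach avoids the silent mismatch.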

I have fixes for error 1 in the tree, and I will track down and fix the Huffman/CDIC code with my own test case. I will post an updated version once I have both errors fixed. Please keep trying them on as many test cases as you have so that we can exercise all of the code and track down these last issues.

Thanks,

Kevin