09-25-2015, 03:27 PM | #1186 |
Zealot
Posts: 128
Karma: 500
Join Date: Aug 2011
Device: kindle, boox
|
Hi, I have a question that is really about Python, but it relates to KindleUnpack.
I want to use the azw3 save feature to strip the azw3 out of a combined mobi created by kindlegen. I modified the kindleunpack.py code a bit and it works really fast: I now get an azw3 file in the output dir (I want to save only the azw3, not the rest of the unpack work).
My problem is that when I call kindleunpack.py from the MS-DOS prompt in another directory, it is unable to find the imported modules; it only works when I'm located in the lib directory. I'm very bad with Python. Is there any way to import modules without adding them to the path in Windows, while calling the script from another directory?
Example: C:\kindlegen\kindleunpack\lib\kindleunpack.py
If my directory is C:\kindlegen\ and I use python kindleunpack\lib\kindleunpack.py, the result is ImportError: No module named compatibility_utils
If the directory is C:\kindlegen\kindleunpack\lib\ it works fine; Python finds the rest of the modules.
Thanks.
PS: I'm trying to write a batch file to convert books with kindlegen but without the extra size from the old mobi format. In fact it would be great if a calibre conversion plugin could be made that uses kindlegen, strips out the azw3, and modifies parameters so it is seen as a normal document. But I don't know if it's possible to bridge calibre's azw3 conversion, and I'm really bad with Python, so a calibre plugin is perhaps beyond my capabilities. I'll try the easy way with an MS-DOS batch file. |
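One possible approach to the import question above, as a minimal sketch: put the directory that holds kindleunpack.py on `sys.path` before importing its sibling modules. The helper name `add_script_dir_to_path` is hypothetical and not part of KindleUnpack. Python normally already places the directory of the script being run at the front of `sys.path`, so if the ImportError persists, something else in the modified script or in how it is launched may be involved.

```python
import os
import sys


def add_script_dir_to_path(script_path):
    """Put the directory containing script_path at the front of sys.path.

    Hypothetical helper for illustration; not part of KindleUnpack.
    """
    lib_dir = os.path.dirname(os.path.abspath(script_path))
    if lib_dir in sys.path:
        sys.path.remove(lib_dir)
    sys.path.insert(0, lib_dir)
    return lib_dir


# Typical use at the very top of a script, before any sibling imports:
#     add_script_dir_to_path(__file__)
#     import compatibility_utils
```

With this in place, sibling modules such as compatibility_utils should resolve regardless of the current working directory, assuming they really live next to the script.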
12-22-2015, 08:09 AM | #1187 |
Zealot
Posts: 107
Karma: 10
Join Date: Feb 2015
Location: India
Device: Kindle PW3
|
Is it possible to get the standalone version running on Android? Python is available for Android in the form of QPython.
|
12-22-2015, 04:01 PM | #1188 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Since I don't own anything running android, I doubt it very much. Have you tried simply moving the python code over and trying?
|
12-23-2015, 06:04 AM | #1189 |
Zealot
Posts: 107
Karma: 10
Join Date: Feb 2015
Location: India
Device: Kindle PW3
|
|
12-23-2015, 06:51 AM | #1190 |
Member
Posts: 16
Karma: 10
Join Date: Oct 2012
Device: Kindle 4
|
I have a mobi which KindleUnpack v0.80 is not able to unpack. I also tried the latest version on GitHub. This is what happens:
Code:
$ ./kindleunpack.py book.mobi
KindleUnpack v0.80
   Based on initial mobipocket version Copyright © 2009 Charles M. Hannum <root@ihack.net>
   Extensive Extensions and Improvements Copyright © 2009-2014
       by: P. Durrant, K. Hendricks, S. Siebert, fandrieu, DiapDealer, nickredding, tkeo.
   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, version 3.
Unpacking Book...
Palm DB type: BOOKMOBI, 118 sections.
Error: 'utf8' codec can't decode byte 0xe8 in position 33: invalid continuation byte
Traceback (most recent call last):
  File "./kindleunpack.py", line 1004, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "./kindleunpack.py", line 878, in unpackBook
    mh = MobiHeader(sect,0)
  File "./mobi_header.py", line 524, in __init__
    self.parseMetaData()
  File "./mobi_header.py", line 818, in parseMetaData
    addValue(name, content.decode(codec))
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe8 in position 33: invalid continuation byte
Last edited by elmimmo; 12-23-2015 at 06:53 AM. Reason: Added environment details. |
12-23-2015, 08:27 AM | #1191 |
The Grand Mouse 高貴的老鼠
Posts: 71,510
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
|
12-23-2015, 08:59 AM | #1192 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
2. Do you have access to the original source files? |
|
12-23-2015, 10:37 AM | #1193 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Or alternatively since this is a metadata issue, please try running the latest version of DumpMobiHeader on it and posting the results here. It may similarly error out but the output will tell us what encoding the book is supposed to be using, version, etc.
KevinH |
12-23-2015, 10:03 PM | #1194 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@KevinH: This is almost certainly caused by an issue with the trailing bytes at the end of every text record. There were (long ago) versions of the dedrm tool that used to produce de-drmed mobi files with corrupted headers (extra data flag set to zero). In such files you can end up with text that contains partial utf-8 byte sequences.
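To illustrate how partial utf-8 byte sequences can arise, here is a minimal Python 3 sketch. It does not implement the MOBI record format or its trailing-byte handling; it only assumes that text is cut into fixed-size byte records without regard to character boundaries. The sample string and the helper names `split_records` and `decodes_cleanly` are made up for the example.

```python
def split_records(data, size):
    """Cut a byte string into fixed-size chunks, ignoring character boundaries."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def decodes_cleanly(chunk):
    """Return True if the chunk is valid utf-8 on its own."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "canción y corazón"
data = text.encode("utf-8")

# With 6-byte records the two-byte sequence for "ó" is split across
# the first and second record.
records = split_records(data, 6)

for number, record in enumerate(records):
    print(number, record, decodes_cleanly(record))

# Decoding only succeeds reliably once the records are joined again.
print(b"".join(records).decode("utf-8"))
```

A reader that trusts a wrong record length (for example because a header flag says there are no trailing bytes) ends up decoding chunks like the first one in isolation.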
|
12-24-2015, 09:05 AM | #1195 | |
Member
Posts: 16
Karma: 10
Join Date: Oct 2012
Device: Kindle 4
|
The book is in Spanish, so at most it will have things like accented vowels or so. I do not have access to its source.
Quote:
Code:
DumpMobiHeader book.mobi
.MOBI First Header Dump from Section 0
Header Version is: 0x6
Header start position is: 0x0
Header Length is: 0x100
Field: compression_type Offset: 0x000 Width: 2 Value: 0x02
Field: fill0 Offset: 0x002 Width: 2 Value: 0x00
Field: text_length Offset: 0x004 Width: 4 Value: 0xba34
Field: text_records Offset: 0x008 Width: 2 Value: 0x0c
Field: max_section_size Offset: 0x00a Width: 2 Value: 0x1000
Field: crypto_type Offset: 0x00c Width: 2 Value: 0x00
Field: fill1 Offset: 0x00e Width: 2 Value: 0x00
Field: magic Offset: 0x010 Width: 4 Value: MOBI
Field: header_length Offset: 0x014 Width: 4 Value: 0x0100
Field: type Offset: 0x018 Width: 4 Value: 0x0002
Field: codepage Offset: 0x01c Width: 4 Value: 0xfde9
Field: unique_id Offset: 0x020 Width: 4 Value: 0x5daedfaf
Field: version Offset: 0x024 Width: 4 Value: 0x0006
Field: metaorthindex Offset: 0x028 Width: 4 Value: 0xffffffff
Field: metainflindex Offset: 0x02c Width: 4 Value: 0xffffffff
Field: index_names Offset: 0x030 Width: 4 Value: 0xffffffff
Field: index_keys Offset: 0x034 Width: 4 Value: 0xffffffff
Field: extra_index0 Offset: 0x038 Width: 4 Value: 0xffffffff
Field: extra_index1 Offset: 0x03c Width: 4 Value: 0xffffffff
Field: extra_index2 Offset: 0x040 Width: 4 Value: 0xffffffff
Field: extra_index3 Offset: 0x044 Width: 4 Value: 0xffffffff
Field: extra_index4 Offset: 0x048 Width: 4 Value: 0xffffffff
Field: extra_index5 Offset: 0x04c Width: 4 Value: 0xffffffff
Field: first_nontext Offset: 0x050 Width: 4 Value: 0x000e
Field: title_offset Offset: 0x054 Width: 4 Value: 0x02b4
Field: title_length Offset: 0x058 Width: 4 Value: 0x0010
Field: language_code Offset: 0x05c Width: 4 Value: 0x040a
Field: dict_in_lang Offset: 0x060 Width: 4 Value: 0x0000
Field: dict_out_lang Offset: 0x064 Width: 4 Value: 0x0000
Field: min_version Offset: 0x068 Width: 4 Value: 0x0006
Field: first_addl_offset Offset: 0x06c Width: 4 Value: 0x0011
Field: huff_offset Offset: 0x070 Width: 4 Value: 0x0000
Field: huff_num Offset: 0x074 Width: 4 Value: 0x0000
Field: huff_tbl_offset Offset: 0x078 Width: 4 Value: 0x0000
Field: huff_tbl_len Offset: 0x07c Width: 4 Value: 0x0000
Field: exth_flags Offset: 0x080 Width: 4 Value: 0x1850
Field: fill3_a Offset: 0x084 Width: 4 Value: 0x0000
Field: fill3_b Offset: 0x088 Width: 4 Value: 0x0000
Field: fill3_c Offset: 0x08c Width: 4 Value: 0x0000
Field: fill3_d Offset: 0x090 Width: 4 Value: 0x0000
Field: fill3_e Offset: 0x094 Width: 4 Value: 0x0000
Field: fill3_f Offset: 0x098 Width: 4 Value: 0x0000
Field: fill3_g Offset: 0x09c Width: 4 Value: 0x0000
Field: fill3_h Offset: 0x0a0 Width: 4 Value: 0x0000
Field: drm_offset Offset: 0x0a8 Width: 4 Value: 0xffffffff
Field: drm_count Offset: 0x0ac Width: 4 Value: 0x0000
Field: drm_size Offset: 0x0b0 Width: 4 Value: 0x0000
Field: drm_flags Offset: 0x0b4 Width: 4 Value: 0x0000
Field: fill4_a Offset: 0x0b8 Width: 4 Value: 0x0000
Field: fill4_b Offset: 0x0bc Width: 4 Value: 0x0000
Field: first_content Offset: 0x0c0 Width: 2 Value: 0x01
Field: last_content Offset: 0x0c2 Width: 2 Value: 0x4c
Field: unknown0 Offset: 0x0c4 Width: 4 Value: 0x0001
Field: fcis_offset Offset: 0x0c8 Width: 4 Value: 0x004e
Field: fcis_count Offset: 0x0cc Width: 4 Value: 0x0001
Field: flis_offset Offset: 0x0d0 Width: 4 Value: 0x004d
Field: flis_count Offset: 0x0d4 Width: 4 Value: 0x0001
Field: unknown1 Offset: 0x0d8 Width: 4 Value: 0x0000
Field: unknown2 Offset: 0x0dc Width: 4 Value: 0x0000
Field: srcs_offset Offset: 0x0e0 Width: 4 Value: 0x004f
Field: srcs_count Offset: 0x0e4 Width: 4 Value: 0x0002
Field: unknown3 Offset: 0x0e8 Width: 4 Value: 0xffffffff
Field: unknown4 Offset: 0x0ec Width: 4 Value: 0xffffffff
Field: fill5 Offset: 0x0f0 Width: 2 Value: 0x00
Field: traildata_flags Offset: 0x0f2 Width: 2 Value: 0x03
Field: ncx_index Offset: 0x0f4 Width: 4 Value: 0x000e
Field: unknown5 Offset: 0x0f8 Width: 4 Value: 0xffffffff
Field: unknown6 Offset: 0x0fc Width: 4 Value: 0xffffffff
Field: datp_offset Offset: 0x100 Width: 4 Value: 0xffffffff
Field: unknown7 Offset: 0x104 Width: 4 Value: 0xffffffff
Extra Region Length: 0x0
EXTH Region Length: 0x21ac
EXTH MetaData
Key: "Published" Value: "2012-08-2"
Error: 'utf8' codec can't decode byte 0xe8 in position 33: invalid continuation byte |
|
12-24-2015, 09:19 AM | #1196 | |
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
|
|
12-24-2015, 03:54 PM | #1197 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
So there is a metadata item that is either binary data that we incorrectly try to interpret as a string, or improperly encoded string data.
The 0xfde9 value for codepage converts to 65001, which is utf-8. So someone probably incorrectly edited the metadata in this mobi (possibly trying to hide something for some reason). If you post it privately for me and pm me the link, I should be able to fix it.
Have you tried loading it in calibre? Kovid's utf-8 decoding routines are most likely more robust than ours.
We could also try modifying kindleunpack to do the decoding with replacement, or to ignore utf-8 errors.
KevinH |
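The codepage arithmetic and the "decode with replacement or ignoring errors" idea can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not KindleUnpack's actual code: the mapping table, the fallback codec, the helper names `codec_for` and `decode_metadata`, and the sample bytes are all invented for the example.

```python
# Assumed mapping from MOBI codepage numbers to Python codec names.
CODEPAGE_TO_CODEC = {1252: "windows-1252", 65001: "utf-8"}


def codec_for(codepage):
    """Pick a codec for a header codepage; the fallback is an assumption."""
    return CODEPAGE_TO_CODEC.get(codepage, "windows-1252")


def decode_metadata(raw, codec, errors="replace"):
    """Decode metadata bytes without raising on malformed input."""
    return raw.decode(codec, errors)


codepage = 0xFDE9  # value reported in the header dump
print(codepage, codec_for(codepage))

# Made-up sample: 0xe8 followed by a space is not valid utf-8.
raw = b"Jos\xe8 Mart"
print(decode_metadata(raw, "utf-8"))            # bad byte becomes U+FFFD
print(decode_metadata(raw, "utf-8", "ignore"))  # bad byte is dropped
print(raw.decode("windows-1252"))               # same bytes read as cp1252
```

Replacement keeps a visible marker where the data was bad, which makes it easier to locate the offending metadata item than silently ignoring the byte.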
12-24-2015, 04:14 PM | #1198 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
Please try again with DumpMobiHeader_v021.py, which I just pushed to my github, and post the output here again. If the problem is improperly encoded metadata, this should work around it and show us where the error might be occurring. We can then see if the bug is in KindleUnpack or in your particular mobi. Thanks, KevinH |
12-30-2015, 12:19 PM | #1199 | |
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
The environment is probably too different for there to ever be a single code-base that worked for Lin/Win/Mac/QPython. It could probably be forked, though. |
|
01-05-2016, 10:37 AM | #1200 | |
Member
Posts: 16
Karma: 10
Join Date: Oct 2012
Device: Kindle 4
|
Just to keep everyone posted, the book KindleUnpack could not unpack had an error:
Quote:
|
|
|