07-12-2014, 10:01 AM | #901 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi,
Quote:
Code:
<meta "property"="rendition:orientation">auto</meta>
Thanks, |
|
07-12-2014, 10:03 AM | #902 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi Paul and DiapDealer ( and anyone else with Kindle ebook collections ),
Do you own an Amazon ebook that uses EXTH 508, 517 or 522? Do you know who reverse engineered those EXTH values?
One quick thing to try is to cd to your My Kindle Content directory and run DumpMobiHeader_v016 (or later) on the *.azw ebook files. It will dump the EXTH even if the ebook is DRM'd, since the headers themselves are not encrypted. I redirect all the output to one big text file and then use grep to find those tag values.
I would love to know if any of those supposed file-as EXTH values are ever set. If so, I will try to grab a sample of that book to see if I can figure out how and why they were set. Thanks, KevinH |
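For anyone who prefers a script to grep, here is a rough Python 3 sketch of the same check. It is not how DumpMobiHeader works: it simply looks for the first b'EXTH' marker in the raw bytes instead of walking the PalmDB and MOBI headers, so false positives are possible. The names exth_ids, make_exth and FILE_AS_IDS are made up for this illustration.

```python
import struct

FILE_AS_IDS = {508, 517, 522}  # the EXTH ids asked about in the post


def exth_ids(data):
    """Return the set of EXTH record ids found in raw ebook bytes.

    Heuristic (an assumption of this sketch): take the first b'EXTH'
    marker as the start of the EXTH block.
    """
    start = data.find(b"EXTH")
    if start < 0 or start + 12 > len(data):
        return set()
    # 4-byte magic, then big-endian header length and record count.
    _length, num_items = struct.unpack(">LL", data[start + 4:start + 12])
    pos = start + 12
    ids = set()
    for _ in range(num_items):
        if pos + 8 > len(data):
            break  # truncated data, or not a real EXTH block
        rec_id, size = struct.unpack(">LL", data[pos:pos + 8])
        if size < 8:
            break  # size counts the 8-byte record header, so < 8 is bogus
        ids.add(rec_id)
        pos += size
    return ids


def make_exth(records):
    """Build a minimal EXTH block from (id, content) pairs, for testing only."""
    body = b"".join(struct.pack(">LL", rid, len(c) + 8) + c for rid, c in records)
    return b"EXTH" + struct.pack(">LL", len(body) + 12, len(records)) + body


sample = b"\x00" * 16 + make_exth([(100, b"Author"), (508, b"kana")])
found = exth_ids(sample) & FILE_AS_IDS
print(sorted(found))
```

To scan a real collection, read each *.azw file in binary mode and pass its bytes to exth_ids(); any file whose result intersects FILE_AS_IDS is worth a closer look with the real dump tool.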
|
07-12-2014, 10:11 AM | #903 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
|
07-12-2014, 10:48 AM | #904 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Amazon ebook that uses EXTH 508, 517 or 522
Hi,
The sample of the following book has Creator file-as and Title file-as EXTH values. The sample is DRM-free.
Zenyaku Genji-Monogatari (Japanese Edition)
ASIN: B00BHHKABO
http://www.amazon.co.jp/%E5%85%A8%E8...89%A9%E8%AA%9E
http://www.amazon.com/Zenyaku-Genji-...s=genji+kindle
Thanks, |
07-12-2014, 11:17 AM | #905 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi,
I have modified KindleUnpack v0.72z to fix bugs and to simplify the code. Apart from handling 'refines' tags and excluding epub3 tags from epub2 output, I think I have done everything I had in mind. Thanks, tkeo |
|
07-12-2014, 08:45 PM | #906 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi Kevin,
Quote:
I had seen somewhere on the internet that file-as meta were used as yomigana (or hurigana) when building an epub, so I thought EXTH 508, 517 and 522 corresponded to the file-as meta of epub3. My guess now is that EXTH 508, 517 and 522 are converted from <meta name="???kana" content="XXXX"/> or <meta name="???gana" content="XXXX"/>. Thanks, |
|
07-12-2014, 11:53 PM | #907 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi Kevin,
Here is another patch for KindleUnpack v0.72z. It includes the patch I posted before. In addition, it has:
- More simplification of mobi_opf.py
- Addition of a print message before makeEPUB() in kindleunpack.py
Take care, tkeo |
07-13-2014, 09:33 AM | #908 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
faster mobi_split.py
Hi,
This is the faster version of mobi_split.py. I have removed the debug code from the version I posted before. The processing times compare as follows:
tested mobi file: 26MB, 164 images
original: 6.3s
modified: 0.5s
To Kevin: please include this in the next official release if possible. Take care, tkeo |
07-13-2014, 08:28 PM | #909 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
Have you tested the new mobi_split code to make sure that it still builds the mobi7 and azw3 pieces completely correctly? I have not had time to look it over yet, but if you are sure, I will include it.
I have also found a few more bugs in KindleUnpack that I will post a patch for either later tonight my time or tomorrow. I have changes for fixing the <image> tag in the svg mobi_cover to be a single type tag (similar to how the img tag is a single tag) ... it seems kindlegen requires that change; and changes in mobi_k8proc.py to both ignore meta tags and stop searching for id= or the older name= attributes when searching for a link target.
In addition, I want to review the hasNCX variable, as some older mobi 4 versions (and older) do not have an ncx index. In the old days we simply did not create a toc.ncx for them, but somehow over the years that code got modified to always create a toc.ncx even though it will be empty. This will mean further code changes in mobi_opf to deal with that remaining issue. I would like to fix that as well, since your change seems to assume hasNCX is always true, but under odd circumstances it won't be.
I also want to remove the mistaken "file-as" EXTH values in mobi_header.py and set a few new values I have found, so as not to confuse others who might use this code as the basis for their own.
Hopefully, I will be able to release a stable version by Tuesday at the latest. Take care, KevinH |
07-13-2014, 10:38 PM | #910 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
patch from v072z_test to hopefully v073
Hi tkeo,
Here is a patch that takes v072z_test up to v073 (hopefully!). It includes your latest cumulative patch and your faster mobi_split patch, as well as a few minor bug fixes from my end as described in my previous post.
I have also made a few things a bit more consistent in the mobi_header.py code and hopefully have dealt with CoverOffset values that are 0xffffffff as well (given your earlier post on that subject).
I have decided not to play with the hasNCX stuff, or with not building a toc.ncx for older Mobi 4's, until after the stable release, as I didn't want to introduce changes that will break things.
Please give it a good testing with all of your Amazon ebooks and let me know if you feel it is now ready for a stable release. If so, I will make the stable release Tuesday evening my time. Thanks! KevinH |
07-14-2014, 08:24 AM | #911 | |||
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi Kevin,
Quote:
I have fixed a bug in taginfo_toxml() of mobi_k8resc.py and modified mobi_header.py. Quote:
508 : 'Unknown_Title_Furigana?_(508)',
517 : 'Unknown_Creator_Furigana?_(517)',
522 : 'Unknown_Publisher_Furigana?_(522)',
in dump_contexth(cpage, extheader). Those in class MobiHeader are not changed. Quote:
Code:
>>> int('0xffffffff')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0xffffffff'
>>>
BTW, prefs.py has CRLF line endings instead of LF. Take care, tkeo |
|||
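As a side note on the traceback above: int() only fails here because it defaults to base 10. A small Python 3 sketch of the standard alternatives (this is plain Python behaviour, not anything taken from the KindleUnpack patch):

```python
text = "0xffffffff"

try:
    int(text)  # base 10 by default, so the "0x" prefix is rejected
except ValueError as err:
    print("base 10 fails:", err)

value = int(text, 16)  # base 16 accepts an optional 0x prefix
auto = int(text, 0)    # base 0 infers the base from the prefix

print(value == 0xFFFFFFFF, auto == value)
```

If the value is already an integer unpacked from the header, no string conversion is needed at all; it can be compared against the literal 0xffffffff directly.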
07-14-2014, 10:00 AM | #912 | |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
I still don't like the comparison against sys.maxint, as that changes with the machine. I simply want to check for one specific missing value, 0xffffffff, as we do with the start offset later on in KindleUnpack and in many places in the header. I will fix that. If it is some other invalid value, I want to know that and let the program barf appropriately so we can figure out how they have changed the setting of CoverOffset. I will add my fix to the dump EXTH code as well. Also, do you have a specific testcase you use with that?
Thanks for catching the extra quotes bug in mobi_k8resc.py. I will remove the extra CRs from prefs.py to keep it consistent with the other files.
Edit: Here is how I am now handling the potentially missing CoverOffset issue (if that is what it even is). I am suspicious that someone has used an improperly written metadata editor and messed up the EXTH size fields somehow. If that is the case, I would rather we fail out, as it will help us better detect where and when this is happening.
From mobi_header.py in parseMetaData(self) Code:
if self.hasExth:
    extheader=self.exth
    _length, num_items = struct.unpack('>LL', extheader[4:12])
    extheader = extheader[12:]
    pos = 0
    for _ in range(num_items):
        id, size = struct.unpack('>LL', extheader[pos:pos+8])
        content = extheader[pos + 8: pos + size]
        if id in MobiHeader.id_map_strings.keys():
            name = MobiHeader.id_map_strings[id]
            addValue(name, unicode(content, codec).encode('utf-8'))
        elif id in MobiHeader.id_map_values.keys():
            name = MobiHeader.id_map_values[id]
            if size == 9:
                value, = struct.unpack('B',content)
                addValue(name, str(value))
            elif size == 10:
                value, = struct.unpack('>H',content)
                addValue(name, str(value))
            elif size == 12:
                value, = struct.unpack('>L',content)
                # handle special case of missing CoverOffset
                if id != 201 or value != 0xffffffff:
                    addValue(name, str(value))
            else:
                print "Warning: Bad key, size, value combination detected in EXTH ", id, size, content.encode('hex')
                addValue(name, content.encode('hex'))
KevinH Quote:
Last edited by KevinH; 07-14-2014 at 12:03 PM. |
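For readers on Python 3, the size-based decoding and the single 0xffffffff check can be sketched on their own as below. This is an illustration under assumptions, not the KindleUnpack code: decode_value, FORMATS, COVER_OFFSET_ID and MISSING are hypothetical names, and returning None for the missing cover offset is just one way to model "do not add the value".

```python
import struct

COVER_OFFSET_ID = 201   # EXTH id treated as CoverOffset in the excerpt above
MISSING = 0xFFFFFFFF    # the one sentinel value that means "not set"

# Total record size (8-byte record header + payload) -> struct format.
FORMATS = {9: "B", 10: ">H", 12: ">L"}


def decode_value(rec_id, size, content):
    """Decode one numeric EXTH record.

    Returns the value as a string, None for a missing CoverOffset,
    or the payload as a hex string when the size is unexpected.
    """
    fmt = FORMATS.get(size)
    if fmt is None:
        print("Warning: bad key, size, value combination in EXTH",
              rec_id, size, content.hex())
        return content.hex()
    # struct.error is raised if the payload length does not match the
    # declared size, so a corrupted header fails loudly.
    (value,) = struct.unpack(fmt, content)
    if rec_id == COVER_OFFSET_ID and value == MISSING:
        return None  # special case: CoverOffset not set
    return str(value)


print(decode_value(201, 12, b"\xff\xff\xff\xff"))
print(decode_value(201, 12, struct.pack(">L", 5)))
print(decode_value(201, 11, b"\x01\x02\x03"))
```

Comparing against one fixed constant keeps the check independent of the platform word size, which is the objection raised against sys.maxint in the post.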
|
07-14-2014, 11:11 AM | #913 | |
Grand Sorcerer
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
|
|
07-15-2014, 08:16 AM | #914 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi Kevin,
Quote:
Thanks, |
|
07-15-2014, 10:22 AM | #915 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Announcing: KindleUnpack_v073 a Stable Release
Hi All,
Attached is KindleUnpack_v073.zip. KindleUnpack version 0.73 is a public release that should be stable (he said hopefully...). Many recent additions and features are incorporated into this release:
- RESC parsing, fixed-layout support, cover generation [Thanks to tkeo]
- Unpacking to epub version 3 support if desired [Thanks to tkeo]
- Much faster mobi splitting [Thanks to tkeo]
- Greatly improved GUI with full preferences support [Thanks to DiapDealer]
- Support for converting PAGE sections into apnx files
- Support for generating real page numbers and page-map.xml from either PAGE sections or associated .apnx files (if and only if that .apnx file was generated from real page numbers and not arbitrary values)
- Support to unpack HDCONTAINER / CRES sections and have them overwrite images that had their resolutions lowered
- Lots and lots of bug fixes
Both the command line and GUI interfaces have been modified to support these new features. The command line options now available are: Code:
python kindleunpack.py [-r -s -d -h -i] [-p APNX_FILE] INPUT_FILE OUTPUT_FOLDER
    INPUT_FILE       - path to the desired Kindle/MobiPocket ebook
    OUTPUT_FOLDER    - path to folder where the ebook will be unpacked
Options:
    -h               print this help message
    -i               use HDImages to overwrite lower resolution versions, if present
    -s               split combination mobis into older mobi and mobi KF8 ebooks
    -p APNX_FILE     path to a .apnx file that contains real page numbers associated with an azw3 ebook (optional)
                     Note: many apnx files have arbitrarily assigned page offsets that will confuse KindleUnpack if used
    --epub_version=  specify epub version to unpack to: 2, 3 or A (for automatic), default is 2
    -r               write raw data to the output folder
    -d               dump headers and other debug info to output and extra files
Thanks, KevinH (for the development team)
Last edited by KevinH; 07-15-2014 at 03:35 PM. |