Old 07-12-2014, 10:01 AM   #901
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,
Quote:
Originally Posted by AcidWeb View Post
orientation-lock have three valid values: portrait, landscape and none
I think you are right, but kindlegen.exe v2.9 does not generate an EXTH orientation-lock value of none from
Code:
<meta property="rendition:orientation">auto</meta>
If rendition:orientation is auto, no EXTH orientation-lock record is added.
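
A minimal sketch of the behaviour described above, for anyone who wants to mirror it in their own tooling. The function name is hypothetical, and the pass-through of portrait and landscape is an assumption; only the auto case comes from the observation with kindlegen v2.9.

```python
def orientation_lock_for(rendition_orientation):
    """Return the orientation-lock value to write, or None to write no record."""
    value = rendition_orientation.strip().lower()
    if value == "auto":
        # Observed with kindlegen v2.9: no orientation-lock EXTH record is added.
        return None
    if value in ("portrait", "landscape"):
        # Assumption: these two values carry over unchanged.
        return value
    raise ValueError("unexpected rendition:orientation value: %r" % rendition_orientation)


for name in ("auto", "portrait", "landscape"):
    print(name, "->", orientation_lock_for(name))
```

Returning None rather than the string 'none' keeps "no record written" distinct from an explicit orientation-lock of none.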

Thanks,
Old 07-12-2014, 10:03 AM   #902
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Paul and DiapDealer ( and anyone else with Kindle ebook collections ),

Do you own an Amazon ebook that uses EXTH 508, 517 or 522? Do you know who reverse engineered those EXTH values?

One quick thing to try is to cd to your My Kindle Content directory and run DumpMobiHeader_v016 (or later) on the *.azw ebook files. It will dump the EXTH even if the ebook is DRM'd since the headers themselves are not encrypted.

I then redirect all the output to a big text file and then use grep to find those tag values.

I would love to know if any of those supposed file-as EXTH values are ever set. If so, I will try to grab a sample of that book to see if I can figure out how they were set and why.
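
For anyone who would rather do the search in Python than with grep, here is a rough sketch. It assumes you already have the raw EXTH block as bytes and that it follows the layout used in the mobi_header.py excerpt later in this thread (record count at bytes 8..12, then id/size/content records). find_exth_records and make_record are hypothetical names, and the sample block is synthetic, not taken from a real book.

```python
import struct

WANTED_IDS = {508, 517, 522}  # the EXTH record types asked about above


def find_exth_records(exth, wanted=WANTED_IDS):
    """Return (id, raw_content) pairs for the wanted EXTH record types."""
    if exth[:4] != b"EXTH":
        raise ValueError("not an EXTH block")
    # Bytes 4..12 hold the header length and the number of records.
    _length, num_items = struct.unpack(">LL", exth[4:12])
    found = []
    pos = 12
    for _ in range(num_items):
        # Each record: 4-byte id, 4-byte size (counting these 8 bytes), content.
        rec_id, size = struct.unpack(">LL", exth[pos:pos + 8])
        if size < 8:
            raise ValueError("corrupt EXTH record size")
        if rec_id in wanted:
            found.append((rec_id, exth[pos + 8:pos + size]))
        pos += size
    return found


def make_record(rec_id, content):
    """Hypothetical helper that builds one synthetic EXTH record for the demo."""
    return struct.pack(">LL", rec_id, len(content) + 8) + content


records = make_record(100, b"Author") + make_record(508, "げんじ".encode("utf-8"))
sample = b"EXTH" + struct.pack(">LL", 12 + len(records), 2) + records

for rec_id, content in find_exth_records(sample):
    print(rec_id, content.decode("utf-8"))
```

Locating the EXTH block inside a real .azw file is left out on purpose; the dump tool already handles that part.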

Thanks,

KevinH
Old 07-12-2014, 10:11 AM   #903
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Quote:
Originally Posted by KevinH View Post
I don't know where or who reversed those tags meanings. I have searched and none of my Amazon ebooks have any of those EXTH tags set.
I have books which have those EXTH tags. They are corresponding to yomigana(pronunciations) of kanji characters in Japanese.
Old 07-12-2014, 10:48 AM   #904
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Amazon ebook that uses EXTH 508, 517 or 522

Hi,

The sample of the following book has Creator file-as and Title file-as EXTH. The sample is no-DRM.

Zenyaku Genji-Monogatari (Japanese Edition) ASIN: B00BHHKABO

http://www.amazon.co.jp/%E5%85%A8%E8...89%A9%E8%AA%9E
http://www.amazon.com/Zenyaku-Genji-...s=genji+kindle

Thanks,
Old 07-12-2014, 11:17 AM   #905
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,

I have modified KindleUnpack v0.72z to fix bugs and to simplify the code.

Except for handling 'refines' tags and excluding epub3 tags from epub2 output, I think I have done everything I had in mind.

Thanks,
tkeo
Attached Files
File Type: txt mobi_opf_z1.patch.txt (12.0 KB, 198 views)
Old 07-12-2014, 08:45 PM   #906
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi Kevin,
Quote:
Originally Posted by KevinH View Post
Do you own an Amazon ebook that uses EXTH 508, 517 or 522? Do you know who reverse engineered those EXTH values?
I added those EXTH values in KindleUnpack v0.63.
I had seen somewhere on the internet that file-as meta entries were used as yomigana (or hurigana) when building an epub, so I thought EXTH 508, 517 and 522 corresponded to the file-as meta of epub3.

My current guess is that EXTH 508, 517 and 522 are converted from
<meta name="???kana" content="XXXX"/> or <meta name="???gana" content="XXXX"/>.

Thanks,
Old 07-12-2014, 11:53 PM   #907
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi Kevin,

Here is another patch for KindleUnpack v0.72z. It includes the patch I posted before. In addition, it has:

- More simplification of mobi_opf.py
- A print message added before makeEPUB() in kindleunpack.py

Take care,
tkeo
Attached Files
File Type: txt KindleUnpack_v072z.patch.txt (14.2 KB, 184 views)
Old 07-13-2014, 09:33 AM   #908
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
faster mobi_split.py

Hi,

This is the faster version of mobi_split.py.
I have removed the debug code from the version I posted before.

The performance comparison is as follows:

tested mobi file: 26MB 164 images

original: 6.3s
modified: 0.5s

To Kevin,
Please include this in the next official release if possible.

Take care,
tkeo
Attached Files
File Type: txt mobi_split_faster.patch.txt (4.5 KB, 179 views)
File Type: zip mobi_split.py.zip (3.9 KB, 189 views)
Old 07-13-2014, 08:28 PM   #909
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi tkeo,
Have you tested the new mobi_split code to make sure that it is still building mobi7 and azw3 pieces completely correctly? I have not had time to look it over yet, but if you are sure, I will include it.

I also have found a few more bugs in KindleUnpack that I will post a patch for either later tonight my time or tomorrow. I have changes for fixing the <image> tag in the svg mobi_cover to be a single type tag (similar to how the img tag is a single tag) ... it seems kindlegen requires that change; and changes in mobi_k8proc.py to both ignore meta tags and stop searching for id= or the older name= attributes when searching for a link target.

In addition, I want to review the hasNCX variable, as some older mobi 4 versions (and older) do not have an ncx index. In the old days we simply did not create a toc.ncx for them, but somehow over the years that code got modified to always create a toc.ncx even though it will be empty. This will mean further code changes in the mobi_opf to deal with that remaining issue. I would like to fix that as well since your change seems to always believe this will be true but under odd circumstances, it won't be.

I also want to remove the mistaken "file-as" EXTH values in mobi_header.py and set a few new values I have found so as not to confuse others who might use this code as the basis for their own.

Hopefully, I will be able to release a stable version by Tuesday at the latest.

Take care,

KevinH
Old 07-13-2014, 10:38 PM   #910
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
patch from v072z_test to hopefully v073

Hi tkeo,

Here is a patch that takes v072z_test up to v073 (hopefully!). It includes your latest cumulative patch and your faster mobi_split patch, plus a few minor bug fixes from my end as described in my previous post. I have also made a few things a bit more consistent in the mobi_header.py code and have hopefully dealt with CoverOffsets that are 0xffffffff as well (given your earlier post on that subject).

I have decided not to play with the hasNCX stuff and not building a toc.ncx for older Mobi 4's until after the stable release as I didn't want to introduce changes that will break things.

Please give it a good testing with all of your Amazon ebooks and let me know if you feel it is now ready for a stable release.

If so, I will make the stable release Tuesday evening my time.

Thanks!

KevinH
Attached Files
File Type: zip preview_v072z_to_v073_patch.txt.zip (8.0 KB, 157 views)
Old 07-14-2014, 08:24 AM   #911
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi Kevin,
Quote:
Originally Posted by KevinH View Post
Have you tested the new mobi_split code to make sure that it is still building mobi7 and azw3 pieces completely correctly? I have not had time to look it over yet, but if you are sure, I will include it.
I have tested with 10 mobi files, 2 of which have HD images and 1 of which has no RESC. The split files are identical to the ones generated by the older mobi_split.py.

I have fixed a bug in taginfo_toxml() of mobi_k8resc.py and modified mobi_header.py.
Quote:
I also want to remove the mistaken "file-as" EXTH values in mobi_header.py and set a few new values I have found so as not to confuse others who might use this code as the basis for their own.
I have changed to
508 : 'Unknown_Title_Furigana?_(508)',
517 : 'Unknown_Creator_Furigana?_(517)',
522 : 'Unknown_Publisher_Furigana?_(522)',
in dump_contexth(cpage, extheader).
Those in class MobiHeader are not changed.

Quote:
hopefully have dealt with CoverOffset's that are 0xffffffff as well (given your earlier post on that subject).
I have modified this part too, since int('0xffffffff') cannot be converted to a long integer.
Code:
>>> int('0xffffffff')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0xffffffff'
>>>
I attach a patch. Hopefully, it is the final patch!
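
For reference, a string like this does parse once a base is supplied. This is only a small Python sketch of the int() behaviour shown in the traceback above, not part of the patch.

```python
text = "0xffffffff"

# Base 16 accepts an optional 0x prefix.
as_hex = int(text, 16)

# Base 0 asks Python to infer the base from the prefix.
as_auto = int(text, 0)

print(as_hex, as_auto, as_hex == 0xffffffff)
```

None of this is needed when the value is already an integer unpacked with struct.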

BTW,
prefs.py has CRLF line ending instead of LF.

Take care,
tkeo
Attached Files
File Type: txt v073_patch.txt (3.4 KB, 297 views)
Old 07-14-2014, 10:00 AM   #912
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi tkeo,

I still don't like the comparison against sys.maxint, as that changes from machine to machine. I simply want to check for one specific missing value, 0xffffffff, as we do with the start offset later on in KindleUnpack and in many places in the header. I will fix that. If it is some other invalid value, I want to know that and let the program barf appropriately so we can figure out how they have changed the setting of CoverOffset. I will add my fix to the dump EXTH code as well. Also, do you have a specific testcase you use with that?

Thanks for catching the extra quotes bug in mobi_k8resc.py. I will remove the extra CRs from prefs.py to keep it consistent with the other files.

Edit:

Here is how I am now handling the potentially missing CoverOffset issue (if that is what it even is). I am suspicious that someone has used an improperly written meta data editor and messed up the EXTH size fields somehow. If that is the case, I would rather we fail out as it will help us better detect where and when this is happening.

From mobi_header.py in parseMetaData(self)

Code:
        if self.hasExth:
            extheader=self.exth
            _length, num_items = struct.unpack('>LL', extheader[4:12])
            extheader = extheader[12:]
            pos = 0
            for _ in range(num_items):
                id, size = struct.unpack('>LL', extheader[pos:pos+8])
                content = extheader[pos + 8: pos + size]
                if id in MobiHeader.id_map_strings.keys():
                    name = MobiHeader.id_map_strings[id]
                    addValue(name, unicode(content, codec).encode('utf-8'))
                elif id in MobiHeader.id_map_values.keys():
                    name = MobiHeader.id_map_values[id]
                    if size == 9:
                        value, = struct.unpack('B',content)
                        addValue(name, str(value))
                    elif size == 10:
                        value, = struct.unpack('>H',content)
                        addValue(name, str(value))
                    elif size == 12:
                        value, = struct.unpack('>L',content)
                        # handle special case of missing CoverOffset
                        if id != 201 or value != 0xffffffff:
                            addValue(name, str(value))
                    else:
                        print "Warning: Bad key, size, value combination detected in EXTH ", id, size, content.encode('hex')
                        addValue(name, content.encode('hex'))
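
The numeric branch above can be restated as a small standalone sketch in Python 3. This is an illustration only, not the code in mobi_header.py: decode_numeric and the FORMATS table are hypothetical names, while the sizes, struct formats, id 201 and the 0xffffffff check are taken from the excerpt.

```python
import struct

COVER_OFFSET_ID = 201
MISSING = 0xffffffff  # fixed sentinel, the same value on every machine

# EXTH record size (content plus the 8-byte id/size prefix) -> struct format
FORMATS = {9: "B", 10: ">H", 12: ">L"}


def decode_numeric(rec_id, size, content):
    """Return the value as a string, or None for a missing CoverOffset."""
    fmt = FORMATS.get(size)
    if fmt is None:
        # Unexpected size: fall back to a hex dump, as the excerpt does.
        return content.hex()
    value, = struct.unpack(fmt, content)
    if rec_id == COVER_OFFSET_ID and value == MISSING:
        return None
    return str(value)


print(decode_numeric(201, 12, struct.pack(">L", 7)))
print(decode_numeric(201, 12, struct.pack(">L", 0xffffffff)))
```

Comparing against the literal 0xffffffff means any other odd value still gets through and shows up in the output, which matches the stated wish to fail loudly instead of hiding it.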
Thanks,

KevinH

Last edited by KevinH; 07-14-2014 at 12:03 PM.
Old 07-14-2014, 11:11 AM   #913
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
BTW,
prefs.py has CRLF line ending instead of LF.
Well that's odd ... but almost certainly entirely my fault.
Old 07-15-2014, 08:16 AM   #914
tkeo
Connoisseur
tkeo began at the beginning.
 
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi Kevin,
Quote:
Originally Posted by KevinH View Post
I simply want to check for one specific missing value 0xffffffff as we do with the start offset later on in KindleUnpack and many places in the header.
I mistakenly thought the value was the string '0xffffffff' rather than the integer 0xffffffff, so my modification is not necessary.

Thanks,
Old 07-15-2014, 10:22 AM   #915
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
Announcing: KindleUnpack_v073 a Stable Release

Hi All,

Attached is KindleUnpack_v073.zip. KindleUnpack version 0.73 is a public release that should be stable (he said hopefully...).

Many recent additions and features are incorporated into this release:

- RESC parsing, fixed-layout support, cover generation [Thanks to tkeo]

- Unpacking to epub version 3 support if desired [Thanks to tkeo]

- Much faster mobi splitting [Thanks to tkeo]

- Greatly Improved GUI with full preferences support [Thanks to DiapDealer]

- Support for converting PAGE sections into apnx files

- Support for generating real page numbers and page-map.xml from either PAGE sections or associated .apnx files (if and only if that .apnx file was generated from real page numbers and not arbitrary values)

- Support to unpack HDCONTAINER / CRES sections and have them overwrite images that had their resolutions lowered

- lots and lots of bug fixes

Both the command line and GUI interface have been modified to support these new features.

The command line options now available are:

Code:
python kindleunpack.py [-r -s -d -h -i] [-p APNX_FILE] INPUT_FILE OUTPUT_FOLDER


   INPUT_FILE      - path to the desired Kindle/MobiPocket ebook

   OUTPUT_FOLDER   - path to folder where the ebook will be unpacked

Options:

    -h               print this help message

    -i               use HDImages to overwrite lower resolution versions, if present

    -s               split combination mobis into older mobi and mobi KF8 ebooks

    -p APNX_FILE     path to a .apnx file that contains real page numbers associated with an azw3 ebook (optional)
                     Note: many apnx files have arbitrarily assigned page offsets that will confuse KindleUnpack if used

   --epub_version=   specify epub version to unpack to: 2, 3 or A (for automatic), default is 2

    -r               write raw data to the output folder

    -d               dump headers and other debug info to output and extra files
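
If you drive KindleUnpack from another script, a small wrapper along these lines may help. It is only a sketch: build_command is a hypothetical helper, the file names are placeholders, and it assumes kindleunpack.py is in the current directory. The flags themselves are the ones listed above.

```python
def build_command(input_file, output_folder, *, raw=False, split=False,
                  dump=False, hd_images=False, apnx_file=None,
                  epub_version=None):
    """Build the argument list for a kindleunpack.py run."""
    cmd = ["python", "kindleunpack.py"]
    if raw:
        cmd.append("-r")
    if split:
        cmd.append("-s")
    if dump:
        cmd.append("-d")
    if hd_images:
        cmd.append("-i")
    if apnx_file is not None:
        cmd += ["-p", apnx_file]
    if epub_version is not None:
        version = str(epub_version)
        if version not in {"2", "3", "A"}:
            raise ValueError("epub_version must be 2, 3 or A")
        cmd.append("--epub_version=" + version)
    cmd += [input_file, output_folder]
    return cmd


# Placeholder paths; pass the list to subprocess.run(cmd, check=True) to execute.
print(build_command("book.azw3", "unpacked", split=True, epub_version=3))
```

Building a list instead of one shell string avoids quoting problems with paths that contain spaces.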
Please give it a good workout and report any bugs here. Hope you all find this useful.

Thanks,

KevinH (for the development team)
Attached Files
File Type: zip KindleUnpack_v073.zip (85.4 KB, 194 views)

Last edited by KevinH; 07-15-2014 at 03:35 PM.
All times are GMT -4.