MobileRead Forums > E-Book Formats > Kindle Formats

Old 01-14-2012, 12:18 PM   #31
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi,

I ran "strings" on kindlegen and it appears to have the following option:

-donotaddsource

Has anyone tried the latest kindlegen to see if this works? Your file sizes should be a lot smaller.

KevinH
Old 01-14-2012, 12:50 PM   #32
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I think that might be a leftover from earlier versions. I still get an "unsupported argument" error when trying the -donotaddsource switch with the latest kindlegen.
Old 01-17-2012, 12:41 PM   #33
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
new version of a K8 aware kindlestrip program

Hi,

I modified kindlestrip_v130.py provided above to properly update the EXTH 121 metadata value if need be and now it appears to work just fine with the KindlePreviewer.

So here is an experimental kindlestrip_v132.py.zip that hopefully supports K8-style mobis.
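
For anyone curious what updating EXTH 121 involves, here is a minimal Python 3 sketch. It is not taken from the attached script. It assumes the EXTH block starts right after the MOBI header (an `EXTH` magic, a 4-byte length, a 4-byte record count, then records of type, length and data), and it assumes the stripped section sat before the KF8 boundary, so the stored section number has to drop by the number of sections removed. The name `adjust_exth_121` is made up for illustration.

```python
import struct

def adjust_exth_121(section0, removed_before_boundary):
    """Return a copy of section 0 with EXTH record 121 lowered by the given amount."""
    rec = bytearray(section0)
    (mobi_len,) = struct.unpack_from(">L", rec, 0x14)
    exth = 0x10 + mobi_len              # assumption: EXTH follows the MOBI header directly
    if rec[exth:exth + 4] != b"EXTH":
        return bytes(rec)               # no EXTH block, nothing to adjust
    (count,) = struct.unpack_from(">L", rec, exth + 8)
    pos = exth + 12                     # first record starts after magic, length, count
    for _ in range(count):
        rec_type, rec_len = struct.unpack_from(">LL", rec, pos)
        if rec_len < 8:
            break                       # malformed record, stop rather than loop forever
        if rec_type == 121 and rec_len == 12:
            (boundary,) = struct.unpack_from(">L", rec, pos + 8)
            if boundary != 0xFFFFFFFF:  # 0xffffffff is treated here as "not set"
                struct.pack_into(">L", rec, pos + 8,
                                 boundary - removed_before_boundary)
        pos += rec_len                  # record length includes its 8-byte header
    return bytes(rec)
```

The function returns a new bytes object, so the caller can compare before and after when checking a file in the KindlePreviewer.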

KevinH
Attached Files
File Type: zip kindlestrip_v132.py.zip (3.4 KB, 910 views)
Old 01-17-2012, 01:21 PM   #34
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by KevinH View Post
Hi,

I modified kindlestrip_v130.py provided above to properly update the EXTH 121 metadata value if need be and now it appears to work just fine with the KindlePreviewer.

So here is an experimental kindlestrip_v132.py.zip that hopefully support K8 style mobis.

KevinH
Great! I experienced no issues with the method that zeroed the section where the SRCS used to be, but who knows if it would have caused problems when submitting to Amazon or not. This approach seems more "official."
Old 01-30-2012, 12:42 PM   #35
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi DiapDealer,

Based on looking at Nick's mobi_split.py code, it seems that the Mobi Header actually has a pointer and count to the SRCS record:

srcs_index = 224 (or 0xe0)
srcs_count = 228 (or 0xe4)

So I think we need a new version of kindlestrip.py that, once it removes the SRCS section, modifies the mobi (section 0) header to set the field at 0xe0 to 0xffffffff and the field at 0xe4 to 0.

We probably need to do that (or at least check) inside both the mobi7 header and the kf8 mobi header. We should also probably back-port this change to the original kindlestrip.py as well.
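
A minimal Python 3 sketch of that header patch, assuming the two offsets are measured from the start of the header record and that both fields are big-endian 32-bit integers. `clear_srcs_fields` is a hypothetical helper, not code from kindlestrip.py.

```python
import struct

SRCS_INDEX = 0xE0  # offset of the SRCS section number inside the header record
SRCS_COUNT = 0xE4  # offset of the SRCS section count

def clear_srcs_fields(header):
    """Return a copy of a MOBI header record with the SRCS pointer cleared."""
    rec = bytearray(header)
    if rec[0x10:0x14] != b"MOBI" or len(rec) < SRCS_COUNT + 4:
        return bytes(rec)  # not a MOBI header, or too short to carry these fields
    # Index becomes 0xffffffff ("none") and count becomes 0.
    struct.pack_into(">LL", rec, SRCS_INDEX, 0xFFFFFFFF, 0)
    return bytes(rec)
```

The same call could be made on the kf8 header record as well as on section 0, which is the "at least check" case mentioned above.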

KevinH
Old 01-31-2012, 04:48 PM   #36
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi DiapDealer,

I took a shot at using the srcs index and count info in a hopefully appropriate manner.

I called it kindlestrip_v133.py

I have only tested it in a limited fashion.

KevinH


Quote:
Originally Posted by KevinH View Post
Hi DiapDealer,

Based on looking at Nick's mobi_split.py code, it seems that the Mobi Header actually has a pointer and count to the SRCS record:

srcs_index = 224 (or 0xe0)
srcs_count = 228 (or 0xe4)

So I think we need a new version of kindlestrip.py that once it removes the SRCS section, it modifies the mobi (section 0) header to set 0xe0 to 0xffffffff and 0xe4 to 0.

We probably need to do that (or at least check) inside both the mobi7 header and the kf8 mobi header. We should also probably back-port this change to the original kindlestrip.py as well.

KevinH
Attached Files
File Type: zip kindlestrip_v133.py.zip (3.5 KB, 1039 views)
Old 06-04-2012, 04:57 AM   #37
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
I've updated the first post in this thread to have the latest KindleStrip, as updated by KevinH (& updated by me to add him to the credits).

I've also updated the AppleScript Wrapper to include the latest version.
Old 06-19-2012, 04:15 AM   #38
dilo_sec
Member
Posts: 21
Karma: 244219
Join Date: Jul 2011
Device: K3
tiny bug report

I just upgraded to the latest version and have a tiny bug report. It does not affect the function of this nice utility!

line 142 of kindlestrip_v134.py:
print " beginning at offset %0x and ending at offset %0x" % (srcs_offset, srcs_length)
should be:
print " beginning at offset %0x and ending at offset %0x" % (srcs_offset, next_offset-1)
or this:
print " beginning at offset %0x for length %0x" % (srcs_offset, srcs_length)
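
To make the difference between the two fixes concrete, here is a small Python 3 sketch (the script itself uses Python 2 print statements). It assumes `next_offset` is simply `srcs_offset + srcs_length`, i.e. the start of the following section; `describe_srcs` is a made-up name.

```python
def describe_srcs(srcs_offset, srcs_length):
    """Format the SRCS byte range as start, inclusive end, and length (all hex)."""
    next_offset = srcs_offset + srcs_length  # assumed start of the following section
    return " beginning at offset %0x and ending at offset %0x (length %0x)" % (
        srcs_offset, next_offset - 1, srcs_length)

print(describe_srcs(0x2000, 0x150))
# prints " beginning at offset 2000 and ending at offset 214f (length 150)"
```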
Old 06-19-2012, 08:15 AM   #39
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Quote:
Originally Posted by dilo_sec View Post
i just upgraded to the latest version - tiny bug report - which does not affect the function of this nice utility!

line 142 of kindlestrip_v134.py:
print " beginning at offset %0x and ending at offset %0x" % (srcs_offset, srcs_length)
should be:
print "beginning at offset %0x and ending at offset %0x" % (srcs_offset, next_offset-1)
or this:
print " beginning at offset %0x for length %0x" % (srcs_offset, srcs_length)
Thanks! Hopefully whoever makes the next significant changes will also include this one, which does need doing.
Old 08-12-2012, 05:55 PM   #40
thomass
Wizard
Posts: 1,669
Karma: 2300001
Join Date: Mar 2011
Location: Türkiye
Device: Kindle 5.3.7
Thanks
Old 08-20-2012, 04:14 AM   #41
dilo_sec
Member
Posts: 21
Karma: 244219
Join Date: Jul 2011
Device: K3
mobi with srcs_count = 2

Quote:
Originally Posted by KevinH View Post
Hi DiapDealer,

Based on looking at Nick's mobi_split.py code, it seems that the Mobi Header actually has a pointer and count to the SRCS record:

srcs_index = 224 (or 0xe0)
srcs_count = 228 (or 0xe4)

So I think we need a new version of kindlestrip.py that once it removes the SRCS section, it modifies the mobi (section 0) header to set 0xe0 to 0xffffffff and 0xe4 to 0.

We probably need to do that (or at least check) inside both the mobi7 header and the kf8 mobi header. We should also probably back-port this change to the original kindlestrip.py as well.

KevinH
When using kindlegen to convert an epub to mobi, I've come across a mobi file where srcs_count = 2.

kindlestrip.py displays:
KindleStrip v1.34. Written 2010-2012 by Paul Durrant and Kevin Hendricks.
Found SRCS section number 240, and count 2
Error: SRCS section num does not point to SRCS.

The 1st section (240) starts "PAGE" - this appears to be a "pageMap" section generated from the "page-map.xml" file in the epub - the section contains these strings:
"fileRevisionId" : "1"
"description" : "PageMap from source by kindlegen"

The 2nd section (241) starts "SRCS".

I think the "pageMap" should be retained, "SRCS" section stripped and srcs_count reduced by 1 - I'll attempt to code a fix and test it ...
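
A minimal Python 3 sketch of the detection step only, not the fix itself. It assumes the standard Palm database layout (section count at byte 76, 8-byte record entries starting at byte 78, each beginning with a 4-byte big-endian file offset). The helper names are hypothetical.

```python
import struct

def section_bounds(data, index):
    """Return (start, end) byte offsets of one Palm database section."""
    (num_sections,) = struct.unpack_from(">H", data, 76)

    def start_of(i):
        # Record-info entries are 8 bytes each, beginning at offset 78.
        return struct.unpack_from(">L", data, 78 + 8 * i)[0]

    start = start_of(index)
    # The last section runs to the end of the file.
    end = start_of(index + 1) if index + 1 < num_sections else len(data)
    return start, end

def find_srcs_sections(data, srcs_index, srcs_count):
    """List the sections in the header's SRCS range that really start with SRCS."""
    found = []
    for sec in range(srcs_index, srcs_index + srcs_count):
        start, _end = section_bounds(data, sec)
        if data[start:start + 4] == b"SRCS":
            found.append(sec)
    return found
```

With the file described above, `find_srcs_sections(data, 240, 2)` would be expected to return only section 241, leaving the PAGE section at 240 alone.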

Last edited by dilo_sec; 08-20-2012 at 09:38 AM. Reason: update with kindlestrip.py actual output
Old 08-22-2012, 11:18 AM   #42
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi,

Having the page-map.xml data stored inside the mobi and not in a separate file is new (and interesting). I am not sure whether it needs to be stripped or not. The issue is the sanity check for SRCS. Perhaps that check should allow SRCS or PAGE, or maybe we need to be able to extract the PAGE information similar to how we extract the SRCS.

Would you please post a zip archive of a sample epub ebook that uses a page-map.xml file, so that we can run it through the latest kindlegen to see exactly what is being stored in the PAGE section of the mobi and whether it is used or referenced anywhere else in the header?

Thanks,

KevinH
Old 08-22-2012, 03:08 PM   #43
dilo_sec
Member
Posts: 21
Karma: 244219
Join Date: Jul 2011
Device: K3
kindlestrip_v135.py

Quote:
Originally Posted by KevinH View Post
Would you please post a zip archive of a sample epub ebook that uses a pagemap.xml file so that we can run it through the latest kindlegen to see exactly what is being stored in the PAGE section of the mobi and if it is used or referenced anyplace else in the header.
I've finished coding and testing a fix to the 'feature' where srcs_cnt is greater than 1 - I simply scan the srcs_cnt sections, strip out the 'SRCS' section if found, and update the section references in the headers. kindlestrip_v135.py.zip is attached below.

Incidentally, kindlegen v1.2, v2.4 and the latest v2.5 all create a mobi with srcs_cnt = 2 from the files in the sample epub (LOREM2.epub), also attached below.

I've only tested in the Kindle Previewer (I'm Kindle-less at the moment!) using the 2 de-SRCS'ed mobi files I have (sample and ebook where I first came across srcs_cnt=2).
Attached Files
File Type: epub LOREM2.epub (3.7 KB, 711 views)
File Type: zip kindlestrip_v135.py.zip (3.7 KB, 728 views)
Old 08-23-2012, 11:05 PM   #44
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
inclusion of pagemap

Hi,

Thanks for posting LOREM2.epub. I used it with Kindlegen 2.5 and found that the page map information from page-map.xml is somehow encoded (into position or byte offset info) and included in *both* the Mobi6 Header and the Mobi8 header inside the mobi.

I had never actually seen that before. The SRCS offset and count were never typically set in the Mobi8 header. But that makes sense as the formats are different enough that the Mobi8 version would need different page map information.

Here is what the latest version of DumpMobiHeader_v010.py shows for the kindlegen generated mobi (note the Section Map at the end as well):

kbhend$ python DumpMobiHeader_v010.py LOREM2.mobi
DumpMobiHeader v010
LOREM2.mobi .MOBI


First Header Dump from Section 0
Header Version is: 0x6
Header start position is: 0x0
Header Length is: 0xf8
Field: compression_type Offset: 0x000 Width: 2 Value: 0x02
Field: fill0 Offset: 0x002 Width: 2 Value: 0x00
Field: text_length Offset: 0x004 Width: 4 Value: 0x1796
Field: text_records Offset: 0x008 Width: 2 Value: 0x02
Field: max_section_size Offset: 0x00a Width: 2 Value: 0x1000
Field: crypto_type Offset: 0x00c Width: 2 Value: 0x00
Field: fill1 Offset: 0x00e Width: 2 Value: 0x00
Field: magic Offset: 0x010 Width: 4 Value: MOBI
Field: header_length Offset: 0x014 Width: 4 Value: 0x00f8
Field: type Offset: 0x018 Width: 4 Value: 0x0002
Field: codepage Offset: 0x01c Width: 4 Value: 0xfde9
Field: unique_id Offset: 0x020 Width: 4 Value: 0xaa53c38e
Field: version Offset: 0x024 Width: 4 Value: 0x0006
Field: metaorthindex Offset: 0x028 Width: 4 Value: 0xffffffff
Field: metainflindex Offset: 0x02c Width: 4 Value: 0xffffffff
Field: index_names Offset: 0x030 Width: 4 Value: 0xffffffff
Field: index_keys Offset: 0x034 Width: 4 Value: 0xffffffff
Field: extra_index0 Offset: 0x038 Width: 4 Value: 0xffffffff
Field: extra_index1 Offset: 0x03c Width: 4 Value: 0xffffffff
Field: extra_index2 Offset: 0x040 Width: 4 Value: 0xffffffff
Field: extra_index3 Offset: 0x044 Width: 4 Value: 0xffffffff
Field: extra_index4 Offset: 0x048 Width: 4 Value: 0xffffffff
Field: extra_index5 Offset: 0x04c Width: 4 Value: 0xffffffff
Field: first_nontext Offset: 0x050 Width: 4 Value: 0x0003
Field: title_offset Offset: 0x054 Width: 4 Value: 0x0238
Field: title_length Offset: 0x058 Width: 4 Value: 0x000a
Field: language_code Offset: 0x05c Width: 4 Value: 0x0009
Field: dict_in_lang Offset: 0x060 Width: 4 Value: 0x0000
Field: dict_out_lang Offset: 0x064 Width: 4 Value: 0x0000
Field: min_version Offset: 0x068 Width: 4 Value: 0x0006
Field: first_resc_offset Offset: 0x06c Width: 4 Value: 0x0006
Field: huff_offset Offset: 0x070 Width: 4 Value: 0x0000
Field: huff_num Offset: 0x074 Width: 4 Value: 0x0000
Field: huff_tbl_offset Offset: 0x078 Width: 4 Value: 0x0000
Field: huff_tbl_len Offset: 0x07c Width: 4 Value: 0x0000
Field: exth_flags Offset: 0x080 Width: 4 Value: 0x0858
Field: fill3_a Offset: 0x084 Width: 4 Value: 0x0000
Field: fill3_b Offset: 0x088 Width: 4 Value: 0x0000
Field: fill3_c Offset: 0x08c Width: 4 Value: 0x0000
Field: fill3_d Offset: 0x090 Width: 4 Value: 0x0000
Field: fill3_e Offset: 0x094 Width: 4 Value: 0x0000
Field: fill3_f Offset: 0x098 Width: 4 Value: 0x0000
Field: fill3_g Offset: 0x09c Width: 4 Value: 0x0000
Field: fill3_h Offset: 0x0a0 Width: 4 Value: 0x0000
Field: drm_offset Offset: 0x0a8 Width: 4 Value: 0xffffffff
Field: drm_count Offset: 0x0ac Width: 4 Value: 0x0000
Field: drm_size Offset: 0x0b0 Width: 4 Value: 0x0000
Field: drm_flags Offset: 0x0b4 Width: 4 Value: 0x0000
Field: fill4_a Offset: 0x0b8 Width: 4 Value: 0x0000
Field: fill4_b Offset: 0x0bc Width: 4 Value: 0x0000
Field: first_content Offset: 0x0c0 Width: 2 Value: 0x01
Field: last_content Offset: 0x0c2 Width: 2 Value: 0x06
Field: unknown0 Offset: 0x0c4 Width: 4 Value: 0x0001
Field: fcis_offset Offset: 0x0c8 Width: 4 Value: 0x0008
Field: fcis_count Offset: 0x0cc Width: 4 Value: 0x0001
Field: flis_offset Offset: 0x0d0 Width: 4 Value: 0x0007
Field: flis_count Offset: 0x0d4 Width: 4 Value: 0x0001
Field: unknown1 Offset: 0x0d8 Width: 4 Value: 0x0000
Field: unknown2 Offset: 0x0dc Width: 4 Value: 0x0000
Field: srcs_offset Offset: 0x0e0 Width: 4 Value: 0x0009
Field: srcs_count Offset: 0x0e4 Width: 4 Value: 0x0002
Field: unknown3 Offset: 0x0e8 Width: 4 Value: 0xffffffff
Field: unknown4 Offset: 0x0ec Width: 4 Value: 0xffffffff
Field: fill5 Offset: 0x0f0 Width: 2 Value: 0x00
Field: traildata_flags Offset: 0x0f2 Width: 2 Value: 0x03
Field: ncx_index Offset: 0x0f4 Width: 4 Value: 0x0003
Field: unknown5 Offset: 0x0f8 Width: 4 Value: 0xffffffff
Field: unknown6 Offset: 0x0fc Width: 4 Value: 0xffffffff
Field: datp_offset Offset: 0x100 Width: 4 Value: 0xffffffff
Field: unknown7 Offset: 0x104 Width: 4 Value: 0xffffffff
Extra Region Length: 0x0
EXTH Region Length: 0x2130
EXTH MetaData

Key: "Published"
Value: "2012-08-20"

Key: "Creator"
Value: "E X Ample"

Key: "Subject"
Value: "Sample Text"

Key: "Description"
Value: "Sample Text"

Key: "Language_(524)"
Value: "en"

Key: "TextDirection"
Value: "horizontal-lr"

Key: "K8(129)_Masthead/Cover_Image"
Value: "kindle:embed:0001"

Key: "K8(131)_Unidentified_Count"
Value: 0x0000

Key: "StartOffset"
Value: 0x027b

Key: "Font Signature (hex)"
Value: 0x0100000000000000000000000000008000000000000000000000000000000000bef4edec

Key: "Creator Software"
Value: 0x00ca

Key: "Creator Major Version"
Value: 0x0002

Key: "Creator Minor Version"
Value: 0x0005

Key: "Kindlegen_BuildRev_Number"
Value: "0626-3a91e28"

Key: "Creator Build Number"
Value: 0x0000

Key: "K8(125)_Count_of_Resources_Fonts_Images"
Value: 0x0001

Key: "K8(121)_Boundary_Section"
Value: 0x000c


Mobi Ebook uses the new dual mobi/KF8 file format

Second Header Dump from Section 12
Header Version is: 0x8
Header start position is: 0xc
Header Length is: 0xf8
Field: compression_type Offset: 0x000 Width: 2 Value: 0x02
Field: fill0 Offset: 0x002 Width: 2 Value: 0x00
Field: text_length Offset: 0x004 Width: 4 Value: 0x19df
Field: text_records Offset: 0x008 Width: 2 Value: 0x02
Field: max_section_size Offset: 0x00a Width: 2 Value: 0x1000
Field: crypto_type Offset: 0x00c Width: 2 Value: 0x00
Field: fill1 Offset: 0x00e Width: 2 Value: 0x00
Field: magic Offset: 0x010 Width: 4 Value: MOBI
Field: header_length Offset: 0x014 Width: 4 Value: 0x00f8
Field: type Offset: 0x018 Width: 4 Value: 0x0002
Field: codepage Offset: 0x01c Width: 4 Value: 0xfde9
Field: unique_id Offset: 0x020 Width: 4 Value: 0xaa53c38e
Field: version Offset: 0x024 Width: 4 Value: 0x0008
Field: metaorthindex Offset: 0x028 Width: 4 Value: 0x0004
Field: metainflindex Offset: 0x02c Width: 4 Value: 0xffffffff
Field: index_names Offset: 0x030 Width: 4 Value: 0xffffffff
Field: index_keys Offset: 0x034 Width: 4 Value: 0xffffffff
Field: extra_index0 Offset: 0x038 Width: 4 Value: 0xffffffff
Field: extra_index1 Offset: 0x03c Width: 4 Value: 0xffffffff
Field: extra_index2 Offset: 0x040 Width: 4 Value: 0xffffffff
Field: extra_index3 Offset: 0x044 Width: 4 Value: 0xffffffff
Field: extra_index4 Offset: 0x048 Width: 4 Value: 0xffffffff
Field: extra_index5 Offset: 0x04c Width: 4 Value: 0xffffffff
Field: first_nontext Offset: 0x050 Width: 4 Value: 0x0004
Field: title_offset Offset: 0x054 Width: 4 Value: 0x0238
Field: title_length Offset: 0x058 Width: 4 Value: 0x000a
Field: language_code Offset: 0x05c Width: 4 Value: 0x0009
Field: dict_in_lang Offset: 0x060 Width: 4 Value: 0x0000
Field: dict_out_lang Offset: 0x064 Width: 4 Value: 0x0000
Field: min_version Offset: 0x068 Width: 4 Value: 0x0008
Field: first_resc_offset Offset: 0x06c Width: 4 Value: 0x000f
Field: huff_offset Offset: 0x070 Width: 4 Value: 0x0000
Field: huff_num Offset: 0x074 Width: 4 Value: 0x0000
Field: huff_tbl_offset Offset: 0x078 Width: 4 Value: 0x0000
Field: huff_tbl_len Offset: 0x07c Width: 4 Value: 0x0000
Field: exth_flags Offset: 0x080 Width: 4 Value: 0x0058
Field: fill3_a Offset: 0x084 Width: 4 Value: 0x0000
Field: fill3_b Offset: 0x088 Width: 4 Value: 0x0000
Field: fill3_c Offset: 0x08c Width: 4 Value: 0x0000
Field: fill3_d Offset: 0x090 Width: 4 Value: 0x0000
Field: fill3_e Offset: 0x094 Width: 4 Value: 0x0000
Field: fill3_f Offset: 0x098 Width: 4 Value: 0x0000
Field: fill3_g Offset: 0x09c Width: 4 Value: 0x0000
Field: fill3_h Offset: 0x0a0 Width: 4 Value: 0x0000
Field: unknown0 Offset: 0x0a4 Width: 4 Value: 0xffffffff
Field: drm_offset Offset: 0x0a8 Width: 4 Value: 0xffffffff
Field: drm_count Offset: 0x0ac Width: 4 Value: 0x0000
Field: drm_size Offset: 0x0b0 Width: 4 Value: 0x0000
Field: drm_flags Offset: 0x0b4 Width: 4 Value: 0x0000
Field: fill4_a Offset: 0x0b8 Width: 4 Value: 0x0000
Field: fill4_b Offset: 0x0bc Width: 4 Value: 0x0000
Field: fdst_offset Offset: 0x0c0 Width: 4 Value: 0x1000e
Field: fdst_flow_count Offset: 0x0c4 Width: 4 Value: 0x0001
Field: fcis_offset Offset: 0x0c8 Width: 4 Value: 0x0010
Field: fcis_count Offset: 0x0cc Width: 4 Value: 0x0001
Field: flis_offset Offset: 0x0d0 Width: 4 Value: 0x000f
Field: flis_count Offset: 0x0d4 Width: 4 Value: 0x0001
Field: unknown1 Offset: 0x0d8 Width: 4 Value: 0x0000
Field: unknown2 Offset: 0x0dc Width: 4 Value: 0x0000
Field: srcs_offset Offset: 0x0e0 Width: 4 Value: 0x0011
Field: srcs_count Offset: 0x0e4 Width: 4 Value: 0x0001
Field: unknown3 Offset: 0x0e8 Width: 4 Value: 0xffffffff
Field: unknown4 Offset: 0x0ec Width: 4 Value: 0xffffffff
Field: fill5 Offset: 0x0f0 Width: 2 Value: 0x00
Field: traildata_flags Offset: 0x0f2 Width: 2 Value: 0x03
Field: ncx_index Offset: 0x0f4 Width: 4 Value: 0x000c
Field: fragment_index Offset: 0x0f8 Width: 4 Value: 0x0004
Field: skeleton_index Offset: 0x0fc Width: 4 Value: 0x0007
Field: datp_offset Offset: 0x100 Width: 4 Value: 0x0012
Field: guide_index Offset: 0x104 Width: 4 Value: 0x0009
Extra Region Length: 0x0
EXTH Region Length: 0x213c
EXTH MetaData

Key: "Published"
Value: "2012-08-20"

Key: "Creator"
Value: "E X Ample"

Key: "Subject"
Value: "Sample Text"

Key: "Description"
Value: "Sample Text"

Key: "Language_(524)"
Value: "en"

Key: "TextDirection"
Value: "horizontal-lr"

Key: "K8(129)_Masthead/Cover_Image"
Value: "kindle:embed:0001"

Key: "K8(131)_Unidentified_Count"
Value: 0x0000

Key: "StartOffset"
Value: 0x027b

Key: "StartOffset"
Value: 0x0314

Key: "Font Signature (hex)"
Value: 0x0100000000000000000000000000008000000000000000000000000000000000bebcaff0

Key: "Creator Software"
Value: 0x00ca

Key: "Creator Major Version"
Value: 0x0002

Key: "Creator Minor Version"
Value: 0x0005

Key: "Kindlegen_BuildRev_Number"
Value: "0626-3a91e28"

Key: "Creator Build Number"
Value: 0x0000

Key: "K8(125)_Count_of_Resources_Fonts_Images"
Value: 0x0000

Map of Palm DB Sections
Dec - Hex : Description
---- - ---- -----------
0000 - 0000: HEADER 6
0001 - 0001: Text Record 0
0002 - 0002: Text Record 1
0003 - 0003: NCX Index 0
0004 - 0004: NCX Index 1
0005 - 0005: NCX Index CNX
0006 - 0006: RESC
0007 - 0007: FLIS
0008 - 0008: FCIS
0009 - 0009: Source Archive 0
0010 - 000a: Source Archive 1
0011 - 000b: BOUNDARY
0012 - 000c: HEADER 8
0013 - 000d: Text Record 0
0014 - 000e: Text Record 1
0015 - 000f: 0000
0016 - 0010: Fragment Index 0
0017 - 0011: Fragment Index 1
0018 - 0012: Fragment Index CNX
0019 - 0013: Skeleton Index 0
0020 - 0014: Skeleton Index_Index 1
0021 - 0015: Guide Index 0
0022 - 0016: Guide Index 1
0023 - 0017: Guide Index CNX
0024 - 0018: NCX Index 0
0025 - 0019: NCX Index 1
0026 - 001a: NCX Index CNX
0027 - 001b: FLIS
0028 - 001c: FCIS
0029 - 001d: Source Archive 0
0030 - 001e: DATP
0031 - 001f: EOF_RECORD
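
One thing the dump suggests, at least for this sample, is that the section numbers stored in the second (Mobi8) header are relative to the KF8 header section rather than to the start of the file. A short Python 3 check using only values copied from the dump and the section map above (`to_absolute` is a made-up helper name):

```python
KF8_HEADER_SECTION = 0x0C  # K8(121)_Boundary_Section from the first header's EXTH

def to_absolute(kf8_header_section, relative_index):
    """Convert an index stored in the Mobi8 header to a Palm DB section number."""
    return kf8_header_section + relative_index

# (field name, value stored in the Mobi8 header, section number in the map)
checks = [
    ("flis_offset", 0x0F, 27),  # map: 0027 FLIS
    ("fcis_offset", 0x10, 28),  # map: 0028 FCIS
    ("srcs_offset", 0x11, 29),  # map: 0029 Source Archive 0
    ("datp_offset", 0x12, 30),  # map: 0030 DATP
]
for name, relative, expected in checks:
    absolute = to_absolute(KF8_HEADER_SECTION, relative)
    print("%s: %d + %d = %d" % (name, KF8_HEADER_SECTION, relative, absolute))
    assert absolute == expected
```

If that holds in general, a strip tool would need to add the KF8 header's section number to srcs_offset from the second header before looking at the section it points to.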


So this is interesting indeed. We need to figure out how the page-map.xml entries are converted and stored in the new kindlegen page map held in the PAGE sections in both the Mobi6 and Mobi8 parts of the .mobi file.

Once we grok that, we should be able to add support for unpacking that information using Mobi_Unpack and then figure out a way for Calibre to generate that page information as well for its joint kf8 and .azw3 files.

Thanks for pointing this out.

Kevin
Old 08-24-2012, 04:37 AM   #45
dilo_sec
Member
Posts: 21
Karma: 244219
Join Date: Jul 2011
Device: K3
KevinH,

The pagemap section maps the page numbers to offsets into the raw (uncompressed) html in the mobi file - (only) the offsets are different for Mobi6 and Mobi8 parts.

I've decoded the pagemap section as follows (the use of some values is unknown):

-- start
0x0000: 50414745 PAGE

0x0004: 00000008 ?
0x0008: 00010001 ?
0x000C: 0000002A ?

0x0010: 0000 block 0?
0x0012: 001E size of block, 0x0032-0x0014 = 1E

0x0014: 7B0A {
0x0016: 2020 20226669 ... "fileRevisionId" : "1"
0x0030: 7D0A }

0x0032: 0001 block 1?
0x0034: 0054 size of block, 0x008E-0x003A = 54
0x0036: 000A pages in pagemap
0x0038: 0010 ?

0x003A: 7B0A {
0x003C: 2020 20226465 ... "description" : "PageMap from source by kindlegen",
0x0073: 2020 20227061 ... "pageMap" : "(1,a,1)"
0x008C: 7D0A }

0x008E: 0281 page_1
0x0090: 0500 page_2
0x0092: 074A page_3
0x0094: 0959 page_4
0x0096: 0B49 page_5
0x0098: 0DCA page_6
0x009A: 1014 page_7
0x009C: 1176 page_8
0x009E: 1413 page_9
0x00A0: 164E page_10
--end pagemap
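
A minimal Python 3 sketch of a reader for that layout. It follows the tentative decode above literally (two text blocks starting at 0x10, then one big-endian 16-bit offset per page) and skips the values whose meaning is unknown, so treat it as a guess rather than a format description. `parse_page_section` is a made-up name.

```python
import struct

def parse_page_section(data):
    """Decode a PAGE section laid out as in the dump above (layout is tentative)."""
    if data[:4] != b"PAGE":
        raise ValueError("not a PAGE section")
    pos = 0x10  # the three 32-bit words at 0x04..0x0F are not understood; skipped
    # Block 0: 2-byte block id, 2-byte text size, then the text itself.
    _block_id, size0 = struct.unpack_from(">HH", data, pos)
    pos += 4
    revision_text = data[pos:pos + size0].decode("utf-8")
    pos += size0
    # Block 1: id, text size, page count, one unknown 16-bit value, then text.
    _block_id, size1, page_count, _unknown = struct.unpack_from(">HHHH", data, pos)
    pos += 8
    map_text = data[pos:pos + size1].decode("utf-8")
    pos += size1
    # One 16-bit offset into the raw (uncompressed) html per page.
    offsets = list(struct.unpack_from(">%dH" % page_count, data, pos))
    return revision_text, map_text, offsets
```

For the sample above this would be expected to give ten offsets, starting with 0x0281 for page 1.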

Apologies about the formatting - all the tabs and multiple spaces have been scrunched up to 1 space!

Hope this is useful ...

DS.