01-04-2015, 06:30 PM   #1115
KevinH
Sigil Developer
Hi elchamaco,

Yes, this is very different from all the other dictionaries I have seen.

Examining the Mobi7 header shows that there is no meta inflection index
at all, but another index called index_names does seem to exist and be used.

Here is a snippet from the Mobi7 header. Note that the metainflindex field has been set to the "missing" value 0xFFFFFFFF, but something called index_names holds a non-0xFFFFFFFF value (0x55E7), and it points to a set of unknown indexes further along than the ones metaorthindex points to.


Code:
Dumping section 0, Mobipocket Header version: 7, total length 272
Mobipocket header from section 0
     Offset  Value Hex Dec        Description
0x000 (  0)     0x0002          2 compression_type
0x002 (  2)     0x0000          0 fill0
0x004 (  4) 0x054DEC84   88992900 text_length
0x008 (  8)     0x54DF      21727 text_records
0x00A ( 10)     0x1000       4096 max_section_size
0x00C ( 12)     0x0000          0 crypto_type
0x00E ( 14)     0x0000          0 fill1
0x010 ( 16)       MOBI            magic
0x014 ( 20) 0x00000100        256 header_length (from MOBI)
0x018 ( 24) 0x00000002          2 type
0x01C ( 28) 0x0000FDE9      65001 codepage
0x020 ( 32) 0x434E1E13 1129192979 unique_id
0x024 ( 36) 0x00000007          7 version
0x028 ( 40) 0x000054E1      21729 metaorthindex
0x02C ( 44) 0xFFFFFFFF 4294967295 metainflindex
0x030 ( 48) 0x000055E7      21991 index_names
0x034 ( 52) 0xFFFFFFFF 4294967295 index_keys
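
For anyone who wants to check another dictionary for the same pattern, here is a minimal Python sketch that reads the four index fields at the offsets shown in the dump. It assumes the fields are big-endian 32-bit values, that the offsets are relative to the start of section 0, and that 0xFFFFFFFF means "missing". The name read_index_fields and the fake section 0 built at the end are made up for illustration; this is not taken from the unpacker's own code.

```python
import struct

MISSING = 0xFFFFFFFF  # value the dump shows for an absent index

# Offsets into section 0, copied from the header dump.
INDEX_FIELDS = {
    "metaorthindex": 0x28,
    "metainflindex": 0x2C,
    "index_names": 0x30,
    "index_keys": 0x34,
}


def read_index_fields(section0: bytes) -> dict:
    """Return each index field as a section number, or None if it is missing."""
    if section0[0x10:0x14] != b"MOBI":
        raise ValueError("section 0 does not carry a MOBI header")
    result = {}
    for name, offset in INDEX_FIELDS.items():
        (value,) = struct.unpack_from(">L", section0, offset)
        result[name] = None if value == MISSING else value
    return result


# Hypothetical test data: a blank 272-byte section 0 filled with the
# four values from the dump, just to exercise the function.
sample = bytearray(272)
sample[0x10:0x14] = b"MOBI"
struct.pack_into(">4L", sample, 0x28, 0x54E1, 0xFFFFFFFF, 0x55E7, 0xFFFFFFFF)

fields = read_index_fields(bytes(sample))
print(fields)
```

With the values from the dump this reports metaorthindex as 21729, index_names as 21991, and both metainflindex and index_keys as None.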

Here are snippets from the map of sections in the mobi ebook:

Code:
The metaorthindex points here:

21729 54E1  0x1C9B030 0x01F04    43458       0 Unknown INDX section, extracting as Unknown21729_INDX.dat
21730 54E2  0x1C9CF34 0x0FBF0    43460       0 Unknown INDX section, extracting as Unknown21730_INDX.dat
21731 54E3  0x1CACB24 0x0FBEC    43462       0 Unknown INDX section, extracting as Unknown21731_INDX.dat
...

--snip--

The index_names field points here:

21991 55E7  0x2C72C3C 0x000E0    43982       0 Unknown INDX section, extracting as Unknown21991_INDX.dat
21992 55E8  0x2C72D1C 0x000CC    43984       0 Unknown INDX section,

So the current dictionary code will not deal with this at all: it never even looks at "index_names", nor does it know how to interpret its data.
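
A first step toward looking at that data could be to collect the run of INDX sections that the field points at. The sketch below assumes the sections have already been split out into a list of byte strings and that each index section starts with the 4-byte magic "INDX", as the section map suggests. The function name indx_run and the toy section list are hypothetical, and nothing here interprets the contents of the index itself.

```python
def indx_run(sections, start):
    """Return the section numbers of the consecutive INDX sections from start."""
    run = []
    for number in range(start, len(sections)):
        # Stop at the first section that does not begin with the INDX magic.
        if sections[number][:4] != b"INDX":
            break
        run.append(number)
    return run


# Hypothetical toy data: two INDX sections followed by something else.
toy_sections = [b"text record", b"INDX header", b"INDX data", b"FLIS"]
run = indx_run(toy_sections, 1)
print(run)
```

On a real file, start would be the index_names value from the header (21991 in this dictionary).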

I think that field name was based on information in the Wiki here, and there may be no one left who remembers why it was named "index_names", as I did not name it.

So supporting these strange dictionaries, which have errors in their own kindlegen logs and which use no inflections index, may require a big reverse engineering effort.

This is not something I can take on soon. But if you post the dictionary someplace that is not so swamped by unknown attacks (say www.datafilehost.com, where I at least know how to avoid the issues) and send a personal message on this site to "KevinH" with the link, I will take a look at eventually supporting it, just not right now as I am tied up with Sigil projects.

Thanks,

KevinH

Quote:
Originally Posted by elchamaco
Hi,

I generated the log with 0.77. The unpack gives the same result as 0.75.

Regards.