Quote:
Originally Posted by KevinH
This is probably old hat to siebert but is all new to me so if anyone has any ideas how to properly decipher the "type" value to map it to the fields that are stored there, it would certainly help.
First of all you have to decode the TAGX section for your index. I've documented that in the Wiki (https://wiki.mobileread.com/wiki/MOBI#TAGX_section).
Then you can decode the index entries with the tag table.
Each entry starts with the control byte(s) (the control byte count is defined in the meta index). Using the bit masks from the tag table you can decode which tags are in that index entry and how many entries there are for each tag.
A bit mask could theoretically contain more than two bits, but so far I've only seen one-bit and two-bit masks. If all bits of a two-bit mask are set to 1, it doesn't mean 3 entries of that tag; instead, another value follows the control byte(s) that defines how many entries of that tag are in the entry.
So with a two-bit mask the control byte encodes 0, 1, 2 or many entries.
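As a rough sketch of that rule in Python (the function name `tag_count` is made up for illustration and is not taken from mobiunpack), the count for a single mask could be extracted like this:

```python
def tag_count(control_byte, mask):
    """Return (count, explicit) for one tag mask.

    explicit is True when every bit of a multi-bit mask is set,
    meaning the real count is stored after the control byte(s).
    """
    value = control_byte & mask
    if value == 0:
        return 0, False
    if value == mask and bin(mask).count("1") > 1:
        # All bits of a multi-bit mask set: the count follows separately.
        return None, True
    # Shift the masked value down to the lowest bit of the mask.
    while mask & 0x01 == 0:
        mask >>= 1
        value >>= 1
    return value, False
```

A one-bit mask can therefore only say "0 or 1", while a two-bit mask says "0, 1, 2, or look it up".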
The tag table also defines how many values each tag has.
With that information you can get all values from an index entry. If you know the meaning of the tag, you can use the values to get the necessary information.
Example:
Control byte count is 1. The tag table has three entries:
0x08, 0x01, 0x03, 0x00 (tag 0x08 has one value and the bitmask 0b11)
0x0a, 0x02, 0x04, 0x00 (tag 0x0a has two values and the bitmask 0b100)
0x00, 0x00, 0x00, 0x01 (end of control byte indicator)
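The three 4-byte rows above can be unpacked with a few lines of Python. This is only a sketch: the names `TagEntry` and `parse_tag_table` are my own for illustration, and it assumes you have already cut the raw tag table bytes out of the TAGX section.

```python
from typing import NamedTuple


class TagEntry(NamedTuple):
    tag: int               # tag number, e.g. 0x08
    values_per_entry: int  # how many values each entry of this tag has
    mask: int              # bit mask to apply to the control byte
    end_flag: int          # 0x01 marks the end of a control byte


def parse_tag_table(raw):
    """Split raw tag table bytes into 4-byte TagEntry rows."""
    if len(raw) % 4 != 0:
        raise ValueError("tag table length must be a multiple of 4")
    return [TagEntry(*raw[i:i + 4]) for i in range(0, len(raw), 4)]


example_table = parse_tag_table(bytes([
    0x08, 0x01, 0x03, 0x00,
    0x0a, 0x02, 0x04, 0x00,
    0x00, 0x00, 0x00, 0x01,
]))
```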
If the first byte of an index entry is 0b00000111, we do an AND operation with the first bitmask and see that the result is 0b11, meaning we must read the next byte to get the actual count of tag 0x08 entries. Let this value be 0x05.
Now we do an AND operation with the next mask and get 0b100, which shifted right by two bits (down to the lowest bit of the mask) is 0b1, so we know that there is one 0x0a entry.
So we've already processed the first two bytes and must now read 5 variable length values for the 5 0x08 tags and 2 variable length values for the one 0x0a tag (as each 0x0a entry contains two values).
If the control byte is 0b00000010, we must read two variable length values for two 0x08 tags.
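Put together, the example could be decoded roughly as below. This is a sketch that follows the description in this post, not the mobiunpack code itself. The function names are invented for illustration, and `read_varlen` assumes the variable length encoding stores 7 bits per byte, most significant group first, with the high bit set on the last byte, so the count 5 appears as the byte 0x85 in the sample data.

```python
def read_varlen(data, pos):
    """Read one variable length value; return (value, new_pos).

    Assumed encoding: 7 payload bits per byte, high bit set on the last byte.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:
            return value, pos


def decode_entry(data, tag_table, control_byte_count=1):
    """Decode one index entry into {tag: [values...]}."""
    pos = control_byte_count  # values start after the control byte(s)
    control_index = 0
    pending = []  # (tag, entry count, values per entry)
    for tag, values_per_entry, mask, end_flag in tag_table:
        if end_flag == 0x01:
            control_index += 1  # move on to the next control byte
            continue
        value = data[control_index] & mask
        if value == 0:
            continue
        if value == mask and bin(mask).count("1") > 1:
            # Multi-bit mask fully set: the count is stored explicitly.
            count, pos = read_varlen(data, pos)
        else:
            while mask & 0x01 == 0:
                mask >>= 1
                value >>= 1
            count = value
        pending.append((tag, count, values_per_entry))
    result = {}
    for tag, count, values_per_entry in pending:
        values = []
        for _ in range(count * values_per_entry):
            v, pos = read_varlen(data, pos)
            values.append(v)
        result[tag] = values
    return result


tag_table = [(0x08, 1, 0x03, 0), (0x0a, 2, 0x04, 0), (0x00, 0, 0x00, 1)]
# Control byte 0b111, count 5, five 0x08 values, two values for one 0x0a entry.
first = decode_entry(bytes([0b00000111, 0x85, 0x81, 0x82, 0x83, 0x84, 0x85,
                            0x90, 0x91]), tag_table)
# Control byte 0b010: two 0x08 entries, no 0x0a entry.
second = decode_entry(bytes([0b00000010, 0x81, 0x82]), tag_table)
```

Collecting the counts first and reading the values afterwards mirrors the order in the example: all explicit counts come directly after the control byte(s), and the tag values follow.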
That's all
I hope it's now clear how to decode an index entry and that I didn't make any mistakes in my description.
As I've said before, the code for this handling is already available in mobiunpack and should be reusable for the ncx index handling.
Ciao,
Steffen