Originally Posted by kovidgoyal
That's because the extra data flags in your mobi are incorrect. They should be:
0b11 (assuming the only trailing data is multibyte overlap and indexing)
Instead, they are
This causes the reading of the trailing data to be incorrect.
The trailing data flags (0b1011 = 0xB) were generated by Kindlegen 1.1 and the trailing data flags (0b11 = 0x3) were generated by Kindlegen 1.2, so I don't understand how either can be "incorrect"--they are what they are.
The original mobiunpack code treated each bit set above the lowest bit of the trailing data flags as a separate TBS, so flags with more than one such bit set meant more than one TBS. That doesn't seem to be the correct interpretation for Kindlegen and Amazon-generated documents (the HTML content goes right up to the one and only TBS), so I fixed the number of TBS at one regardless of the trailing data flags.
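To make the difference between the two flag values concrete, here is a rough Python sketch of the bit counting described above. It is not the actual mobiunpack code; the function names are made up for illustration, and it assumes bit 0 is the multibyte overlap flag, as the quoted post implies.

```python
def has_multibyte_overlap(flags: int) -> bool:
    # Assumption: the lowest bit marks multibyte overlap data.
    return bool(flags & 0b1)


def entries_per_original_reading(flags: int) -> int:
    # Original interpretation: every bit set above the lowest bit
    # counts as one more trailing entry (TBS).
    return bin(flags >> 1).count("1")


def entries_per_fixed_reading(flags: int) -> int:
    # Interpretation after my change: always exactly one TBS,
    # whatever the flags say. The argument is deliberately ignored.
    return 1


for flags in (0b1011, 0b11):  # Kindlegen 1.1 and Kindlegen 1.2 values
    print(
        hex(flags),
        has_multibyte_overlap(flags),
        entries_per_original_reading(flags),
        entries_per_fixed_reading(flags),
    )
```

Under the original reading, 0xB counts as two entries and 0x3 as one, which is where the two Kindlegen versions diverge.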
By the way, the last byte in a TBS is of the form 0x8n where n is the number of bytes in the TBS. If you look at the raw MOBI files you'll see that each HTML record is followed by exactly one TBS.
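If you want to check this yourself, something like the following Python sketch splits a raw record into HTML and TBS using only the 0x8n rule. Treat it as an illustration, not the real unpacking code: `split_trailing_tbs` is a hypothetical helper, the sample bytes are invented, and it assumes the count n includes the size byte itself and that the TBS is shorter than 16 bytes.

```python
def split_trailing_tbs(record: bytes) -> tuple[bytes, bytes]:
    """Split a raw record into (html, tbs) using the 0x8n rule."""
    if not record:
        raise ValueError("empty record")
    last = record[-1]
    # The high nibble must be 0x8 for the byte to look like a TBS size byte.
    if last & 0xF0 != 0x80:
        raise ValueError(f"last byte {last:#04x} is not of the form 0x8n")
    size = last & 0x0F  # n: assumed to count the size byte as well
    if size == 0 or size > len(record):
        raise ValueError(f"implausible TBS size {size}")
    return record[:-size], record[-size:]


# Invented sample: some HTML followed by a 3-byte TBS ending in 0x83.
sample = b"<p>Hi</p>" + b"\x00\x01\x83"
html, tbs = split_trailing_tbs(sample)
print(html, tbs.hex())
```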
EDIT: You posted the last while I was typing