Old 01-04-2009, 02:21 PM   #17
llasram
Reticulator of Tharn
Quote:
Originally Posted by tompe View Post
The code is not removed so it is in the packed data (I just checked). Does it work with mobigen?
It does work with mobigen. The issue is that the Mobipocket renderers (well, Mobipocket Reader Desktop 6.2 anyway) don't actually use the markup to find uncrossable boundaries. Instead, they use a "trailing data item" appended to the end of each text-content record. The details are:
  1. The MOBI header version field(s) must be >= 6.
  2. The MOBI header extra data flags field must have 0x4 set.
  3. Each text record has a trailing data item in the standard form (<data><size>, where <size> is a Mobipocket variable-length encoded integer and includes the size of <size>).
  4. The data consists of a running index of the offsets of uncrossable boundaries, represented by a series of variable-length encoded integers.
  5. The value of each integer is the offset, right-shifted by 3 bits, from the position specified by the previous entry (or from the beginning of the record for the first entry) to the last byte of the boundary tag (the '>' character). The "previous offset" for each entry is not the exact previous offset but the offset as actually encoded, i.e. the exact offset & ~0x7.
I'm actually not 100% sure about the "last byte of the boundary tag" business, but that's how it seems to work. I'm planning to add this plus a few other things I've figured out to the MOBI page on the wiki at some point.
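Items 3 to 5 above can be sketched roughly as follows in Python. This is only an illustration under assumptions the list does not spell out: the byte layout of the variable-length integers (7 bits per byte, most significant group first, with the 0x80 flag on the last byte of a forward-read integer and on the first byte of the backward-read <size>) is assumed, and the names encode_varint, boundary_trailer and decode_boundaries are made up for this example.

```python
def encode_varint(value, backward=False):
    """Encode a non-negative int, 7 bits per byte, most significant group first.

    ASSUMPTION: the 0x80 flag marks the last byte in the forward form and the
    first byte in the backward form used for the trailing <size>.
    """
    groups = []
    while True:
        groups.append(value & 0x7F)
        value >>= 7
        if value == 0:
            break
    # groups[0] is the least significant group, i.e. the final byte written.
    if backward:
        groups[-1] |= 0x80
    else:
        groups[0] |= 0x80
    return bytes(reversed(groups))


def boundary_trailer(boundaries):
    """Build the <data><size> trailing item for one text record.

    `boundaries` holds ascending offsets, relative to the record start, of the
    '>' byte that ends each uncrossable-boundary tag.
    """
    data = bytearray()
    running = 0  # previous offset as actually encoded (low 3 bits dropped)
    for offset in boundaries:
        delta = (offset - running) >> 3
        data += encode_varint(delta)
        running += delta << 3
    # <size> counts <data> plus its own length, so widen it until it fits.
    width = 1
    while True:
        size = encode_varint(len(data) + width, backward=True)
        if len(size) == width:
            break
        width += 1
    return bytes(data) + size


def decode_boundaries(data):
    """Recover the 8-byte-granular offsets from <data> (without <size>)."""
    offsets, running, value = [], 0, 0
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:  # flag set: last byte of this forward integer
            running += value << 3
            offsets.append(running)
            value = 0
    return offsets
```

For example, boundary_trailer([10, 100, 2000]) produces four data bytes followed by a one-byte size of 5, and decoding the data bytes gives the offsets 8, 96 and 2000, which shows the 8-byte granularity that the right shift imposes.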

-Marshall