Quote:
Originally Posted by tompe
But how do I then detect how many bytes there are in the trailing multibyte bytes? How can I know for sure which byte is the one giving the number of bytes? Or can you parse it in reverse order and it is not ambiguous?
Right. You parse each trailing entry backwards. So if all 16 were present, you'd parse #16 at the end of the record, then #15, and so on, with #1 last. I may have made this harder to understand by leaving out, on the Wiki, the distinction between what I'm calling "forwards-encoded" variable-width integers and "backwards-encoded" ones. The sizes of trailing entries 2-16 are backwards-encoded variable-width integers: only the first (most significant) byte has its high bit (0x80) set, which means they are easiest to read backwards. So yeah, start from the end and work backwards.
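To make the byte layout concrete, here is a minimal sketch of reading one backwards-encoded integer. It is written for Python 3 `bytes`, and the name `decode_backward_varint` is my own for illustration, not something from Calibre or the Wiki. It assumes the same rules described above: 7 data bits per byte, at most 4 bytes, and the high bit (0x80) marking the first byte.

```python
from typing import Tuple


def decode_backward_varint(data: bytes, end: int) -> Tuple[int, int]:
    """Decode the integer whose last byte is data[end - 1].

    Returns (value, number_of_bytes_consumed).
    """
    value, shift, pos = 0, 0, end
    # Read at most 4 bytes (28 bits), walking towards the start of the data.
    while pos > 0 and shift < 28:
        byte = data[pos - 1]
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        shift += 7
        pos -= 1
        if byte & 0x80:  # high bit set: this was the first byte, so stop
            break
    return value, end - pos


# 200 = (1 << 7) | 0x48, so it is stored as 0x81 (first byte, flagged) then 0x48.
print(decode_backward_varint(b"\x81\x48", 2))  # (200, 2)
```

Because the flagged byte is the one furthest from the end, a reader starting at the last byte always knows where to stop, which is why reading in reverse is unambiguous.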
This is Calibre's current code for finding the total size of the trailing entries:
Code:
def sizeof_trailing_entries(self, data):
    def sizeof_trailing_entry(ptr, psize):
        bitpos, result = 0, 0
        while True:
            v = ord(ptr[psize-1])
            result |= (v & 0x7F) << bitpos
            bitpos += 7
            psize -= 1
            if (v & 0x80) != 0 or (bitpos >= 28) or (psize == 0):
                return result

    num = 0
    size = len(data)
    flags = self.book_header.extra_flags >> 1
    while flags:
        if flags & 1:
            num += sizeof_trailing_entry(data, size - num)
        flags >>= 1
    if self.book_header.extra_flags & 1:
        num += (ord(data[size - num - 1]) & 0x3) + 1
    return num
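
If you want to try this outside Calibre, here is a rough standalone port for Python 3, where indexing `bytes` already yields integers so `ord()` is not needed. The function name `trailing_size`, the `extra_flags` argument (standing in for `self.book_header.extra_flags`), and the sample record are all made up for illustration. As in the code above, each decoded size is treated as the length of the whole trailing entry.

```python
def trailing_size(record: bytes, extra_flags: int) -> int:
    """Return the total number of trailing-entry bytes at the end of record."""

    def entry_size(end: int) -> int:
        # Backwards-encoded integer ending at record[end - 1].
        value, shift = 0, 0
        while end > 0 and shift < 28:
            byte = record[end - 1]
            value |= (byte & 0x7F) << shift
            shift += 7
            end -= 1
            if byte & 0x80:  # first byte of the integer reached
                break
        return value

    total = 0
    flags = extra_flags >> 1  # bits 1 and up: the variable-width-sized entries
    while flags:
        if flags & 1:
            total += entry_size(len(record) - total)
        flags >>= 1
    if extra_flags & 1:  # bit 0: the multibyte entry, parsed last
        total += (record[len(record) - total - 1] & 0x3) + 1
    return total


# Hypothetical record: 5 bytes of text, a 3-byte multibyte entry
# (last byte 0x02 -> 2 + 1 bytes), then a 4-byte entry whose size byte is 0x84.
record = b"hello" + b"ab\x02" + b"xyz\x84"
n = trailing_size(record, 0b11)
print(n, record[:len(record) - n])  # 7 b'hello'
```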
HTH!