Hi Kovid
I was basing that on the parsing done by mobiunpack.py to get the starting offset of each section. The difference between consecutive starting offsets determines each section's length.
Since section 0 contains the extended header, its size is the difference between the starting positions of section 0 and section 1.
For my test case under Calibre this provides:
going to load section 0 now
loading section 0
before: 2912 and after: 3472
as the starting and ending offsets. That gives a size of 3472 - 2912 = 560 bytes for the extended header (section 0).
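The arithmetic is just a difference of consecutive offsets. A minimal sketch (Python 3; the offset table and function name are made up for illustration, with values matching the Calibre run above):

```python
# Hypothetical offset table standing in for the values mobiunpack.py
# unpacks from the PDB record list.
offsets = (2912, 3472)

def section_size(offsets, n):
    # A section ends where the next one begins.
    return offsets[n + 1] - offsets[n]

print(section_size(offsets, 0))  # -> 560
```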
For my test case under KindleGen this provides:
loading section 0
before: 3816 and after: 12484
as the starting and ending offsets. That gives a size of 12484 - 3816 = 8668 bytes.
Perhaps there is a bug in how mobiunpack.py computes sections, but if you actually open the KindleGen-produced book in emacs, you can see the nearly 8000 bytes of nulls right where it says they should be.
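Rather than eyeballing it in emacs, the padding claim is easy to check programmatically. A hedged sketch (Python 3; the simulated data below just mimics a section with a long null run, it is not read from a real file):

```python
def longest_null_run(data):
    # Length of the longest run of NUL bytes in data.
    best = run = 0
    for b in data:
        run = run + 1 if b == 0 else 0
        best = max(best, run)
    return best

# Simulated section 0: a few real header bytes followed by ~8000 bytes
# of null padding, like the KindleGen case described above.
blob = b'\x10DUMMYHDR' + b'\x00' * 8000 + b'\xff'
print(longest_null_run(blob))  # -> 8000
```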
Here is the (Python 2) code snippet that does the sectioning in mobiunpack.py, for what it is worth.
Code:
import struct

class Sectionizer:
    def __init__(self, filename, perm):
        self.f = file(filename, perm)
        header = self.f.read(78)
        self.ident = header[0x3C:0x3C+8]
        self.num_sections, = struct.unpack_from('>H', header, 76)
        print "number of sections ", self.num_sections
        sections = self.f.read(self.num_sections*8)
        # Each 8-byte record entry is (offset, attributes + unique ID);
        # keep only the offsets and append a sentinel end marker.
        self.sections = struct.unpack_from('>%dL' % (self.num_sections*2), sections, 0)[::2] + (0xfffffff, )
        for z in xrange(self.num_sections):
            print z, " ", self.sections[z]

    def loadSection(self, section):
        print "loading section ", section
        # A section runs from its own offset to the next section's offset.
        before, after = self.sections[section:section+2]
        print "before: ", before, " and after: ", after
        self.f.seek(before)
        return self.f.read(after - before)