Hi tkeo,
I still don't like the comparison against sys.maxint, as that value changes from machine to machine. I simply want to check for the one specific missing value, 0xffffffff, as we do with the start offset later on in KindleUnpack and in many places in the header. I will fix that. If it is some other invalid value, I want to know about it and let the program barf appropriately, so we can figure out how they have changed the setting of CoverOffset. I will add my fix to the dump EXTH code as well. Also, do you have a specific testcase you use with that?
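To illustrate why a fixed sentinel is the more portable test, here is a minimal sketch. It assumes Python 3 (where sys.maxint no longer exists and sys.maxsize is the closest equivalent), and the names MISSING_OFFSET and is_missing_offset are made up for this example, not taken from KindleUnpack.

```python
import struct
import sys

# The on-disk "missing" marker for a 4-byte big-endian field (hypothetical name).
MISSING_OFFSET = 0xffffffff


def is_missing_offset(raw):
    """Return True if the 4 raw bytes hold the 0xffffffff sentinel."""
    value, = struct.unpack('>L', raw)
    return value == MISSING_OFFSET


# The sentinel is the same on every machine; sys.maxsize depends on the build
# (typically 2**31 - 1 on 32-bit and 2**63 - 1 on 64-bit interpreters).
print(is_missing_offset(b'\xff\xff\xff\xff'))
print(sys.maxsize == MISSING_OFFSET)
```

Because '>L' always unpacks exactly four bytes as an unsigned big-endian integer, the comparison does not depend on the platform's native integer width.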
Thanks for catching the extra quotes bug in mobi_k8resc.py. I will remove the extra CRs from prefs.py to keep it consistent with the other files.
Edit:
Here is how I am now handling the potentially missing CoverOffset issue (if that is what it even is). I suspect that someone has used an improperly written metadata editor and messed up the EXTH size fields somehow. If that is the case, I would rather we fail out, as it will help us better detect where and when this is happening.
From mobi_header.py in parseMetaData(self)
Code:
if self.hasExth:
    extheader=self.exth
    _length, num_items = struct.unpack('>LL', extheader[4:12])
    extheader = extheader[12:]
    pos = 0
    for _ in range(num_items):
        # each record is: id (4 bytes), size (4 bytes, includes this 8-byte header), content
        id, size = struct.unpack('>LL', extheader[pos:pos+8])
        content = extheader[pos + 8: pos + size]
        if id in MobiHeader.id_map_strings.keys():
            name = MobiHeader.id_map_strings[id]
            addValue(name, unicode(content, codec).encode('utf-8'))
        elif id in MobiHeader.id_map_values.keys():
            name = MobiHeader.id_map_values[id]
            if size == 9:
                value, = struct.unpack('B',content)
                addValue(name, str(value))
            elif size == 10:
                value, = struct.unpack('>H',content)
                addValue(name, str(value))
            elif size == 12:
                value, = struct.unpack('>L',content)
                # handle special case of missing CoverOffset
                if id != 201 or value != 0xffffffff:
                    addValue(name, str(value))
            else:
                print "Warning: Bad key, size, value combination detected in EXTH ", id, size, content.encode('hex')
                addValue(name, content.encode('hex'))
        # advance to the next EXTH record
        pos += size
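For anyone who wants to try the CoverOffset handling without a real book, here is a rough standalone sketch in Python 3. It is not the KindleUnpack code: build_exth and parse_numeric_exth are hypothetical helpers, only the 1-, 2- and 4-byte numeric records are decoded, and the hand-built block ignores any padding a real EXTH header may carry.

```python
import struct

MISSING_OFFSET = 0xffffffff  # sentinel for a missing CoverOffset (EXTH id 201)


def build_exth(records):
    """Pack (id, payload) pairs into a simplified EXTH block (test helper)."""
    body = b''.join(
        struct.pack('>LL', rec_id, len(payload) + 8) + payload
        for rec_id, payload in records
    )
    return b'EXTH' + struct.pack('>LL', len(body) + 12, len(records)) + body


def parse_numeric_exth(exth):
    """Return {id: int} for numeric records, skipping a missing CoverOffset."""
    _length, num_items = struct.unpack('>LL', exth[4:12])
    data = exth[12:]
    formats = {9: 'B', 10: '>H', 12: '>L'}  # record size -> struct format
    values = {}
    pos = 0
    for _ in range(num_items):
        rec_id, size = struct.unpack('>LL', data[pos:pos + 8])
        content = data[pos + 8:pos + size]
        fmt = formats.get(size)
        if fmt is not None:
            value, = struct.unpack(fmt, content)
            # only id 201 with exactly 0xffffffff is treated as "missing"
            if rec_id != 201 or value != MISSING_OFFSET:
                values[rec_id] = value
        pos += size  # size already includes the 8-byte record header
    return values


sample = build_exth([
    (201, struct.pack('>L', 0xffffffff)),  # missing CoverOffset
    (202, struct.pack('>L', 3)),
    (204, struct.pack('>L', 201)),
])
print(parse_numeric_exth(sample))  # {202: 3, 204: 201}
```

Note that any other odd value in record 201 is still passed through rather than silently dropped, which matches the intent of wanting unexpected values to stay visible.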
Thanks,
KevinH
Quote:
Originally Posted by tkeo
Hi Kevin,
I have tested with 10 mobi files, 2 of which have HD images and 1 of which has no RESC. The split files are identical to the ones generated by the older mobi_split.py.
I have fixed a bug in taginfo_toxml() of mobi_k8resc.py and modified mobi_header.py.
I have changed the following entries in dump_contexth(cpage, extheader):
508 : 'Unknown_Title_Furigana?_(508)',
517 : 'Unknown_Creator_Furigana?_(517)',
522 : 'Unknown_Publisher_Furigana?_(522)',
Those in class MobiHeader are not changed.
I have modified this part too, since int('0xffffffff') cannot be converted to a long integer.
Code:
>>> int('0xffffffff')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0xffffffff'
>>>
I attach a patch. Hopefully, it is the final patch!
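On the int('0xffffffff') failure above: int() assumes base 10 when given a string, so the '0x' prefix is rejected. A small sketch of two standard ways around it (the helper name parse_hex is just for illustration, and this is not necessarily what the patch does):

```python
def parse_hex(text):
    """Convert a hex string such as '0xffffffff' to an integer."""
    # Base 16 accepts an optional '0x' prefix.
    return int(text, 16)


text = '0xffffffff'

try:
    int(text)  # base 10 by default, so this raises ValueError
except ValueError as err:
    print("base 10 failed:", err)

print(parse_hex(text))  # 4294967295
print(int(text, 0))     # base 0 infers the base from the prefix: 4294967295
```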
BTW,
prefs.py has CRLF line ending instead of LF.
Take care,
tkeo