Old 11-15-2012, 12:30 PM   #18
KevinH
Sigil Developer
Hi Kovid,

To work around these issues, ePubFixer uses its own version of zipfile.py, called zipfilerugged.py. It is simply the official zipfile.py with one change: it explicitly catches the decode error raised when a central directory filename contains garbage bytes but has the "encoded as UTF-8" flag set (and they are garbage chars, not UTF-16, as far as I can tell).

Code:
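    def _decodeFilename(self):
        # Bit 11 (0x800) of the general purpose flags marks the name as UTF-8.
        if self.flag_bits & 0x800:
            try:
                return self.filename.decode('utf-8')
            except:
                # Garbage bytes despite the flag: fall back to the raw name.
                return self.filename
        else:
            return self.filename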
This prevents zipfilerugged.py from barfing out when simply trying to open the bad zip.

Then, to fix the cases where the central and local filenames differ (again because of garbage chars in some central directory filenames), ePubFixer uses the following code, which imports zipfilerugged.py:

(see attached zipfix.py)
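
For anyone who just wants to see the mismatch itself, here is a minimal Python 3 sketch (not the attached zipfix.py, and not ePubFixer's code). It compares each central directory name against the name stored in the matching local header. The function name find_name_mismatches is made up for illustration, and the sketch assumes the stock zipfile module can open the archive at all, which is exactly what fails for the garbage UTF-8 names described above.

```python
import io
import struct
import zipfile

# Fixed-size part of a local file header: 30 bytes, little-endian.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def find_name_mismatches(data):
    """Return (central_name, local_name) pairs that disagree.

    `data` is the whole archive as bytes. Hypothetical helper for illustration.
    """
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            start = info.header_offset
            fields = LOCAL_HEADER.unpack(data[start:start + LOCAL_HEADER.size])
            signature, flags, name_len = fields[0], fields[2], fields[9]
            if signature != b"PK\x03\x04":
                continue  # no local header where the central directory points
            name_start = start + LOCAL_HEADER.size
            raw = data[name_start:name_start + name_len]
            # Same rule zipfile applies: UTF-8 if bit 11 is set, else cp437.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw.decode(encoding, errors="replace")
            if local_name != info.orig_filename:
                mismatches.append((info.orig_filename, local_name))
    return mismatches
```

An empty list means every central directory name agrees with its local header.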

Your new approach of reading the entire zip by processing only the local header information should be more robust, and closer to what B&N is using, but ePubFixer's approach works too.
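
A rough Python 3 sketch of the local-headers-only idea, as I understand it (this is not Kovid's actual code). It walks the local file headers from offset 0 and never looks at the central directory. The names local_names and demo_archive are made up for illustration, and the sketch assumes the sizes are stored in the local header; entries that use a data descriptor (flag bit 3) would need extra handling.

```python
import io
import struct
import zipfile

LOCAL_HEADER = struct.Struct("<4s5H3L2H")  # 30-byte fixed part of a local header


def local_names(data):
    """List member names by walking local file headers from offset 0."""
    names = []
    pos = 0
    while data[pos:pos + 4] == b"PK\x03\x04":
        fields = LOCAL_HEADER.unpack(data[pos:pos + LOCAL_HEADER.size])
        flags, comp_size = fields[2], fields[7]
        name_len, extra_len = fields[9], fields[10]
        if flags & 0x08:
            # Sizes live in a trailing data descriptor; this sketch cannot skip it.
            raise ValueError("data descriptor entries are not handled")
        name_start = pos + LOCAL_HEADER.size
        raw = data[name_start:name_start + name_len]
        encoding = "utf-8" if flags & 0x800 else "cp437"
        names.append(raw.decode(encoding, errors="replace"))
        # Jump over the name, the extra field and the compressed data.
        pos = name_start + name_len + extra_len + comp_size
    return names


def demo_archive():
    """Build a tiny two-member archive in memory (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("OEBPS/content.opf", "<package/>")
    return buf.getvalue()
```

Because the loop stops at the first offset that is not a local header signature, it gives the same names even if the central directory is damaged or cut off entirely.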

Hope this helps.

KevinH
Attached Files
File Type: zip zipfix.py.zip (1.6 KB, 227 views)