Old 11-06-2012, 12:19 PM   #2
KevinH
Sigil Developer
Hi,

There are many broken epubs out there (especially from B&N)! These epubs do NOT follow the zip or epub specifications. Epubs are supposed to be zip files.

One form of breakage is to put garbage chars or full UTF-16 Unicode in the zip central directory filenames and then set the flag that indicates the names are UTF-8 encoded. Another form of breakage is for the zip central directory filename not to match the local header filename; most zip access programs trust the (broken) central directory name over the local name to prevent security attacks.
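To make the second kind of breakage concrete, here is a minimal Python 3 sketch (not code from any existing tool) that compares each central directory filename with the name stored in the matching local file header. The function names read_local_name and find_name_mismatches are made up for illustration, and the sketch assumes the central directory itself is still readable by the standard zipfile module, which may not be true for the garbage-filename case.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a zip local file header:
# signature, version, flags, method, time, date, crc, csize, usize, name len, extra len
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
UTF8_FLAG = 0x800  # general purpose bit 11: "filename is UTF-8 encoded"


def read_local_name(data, offset):
    """Return (raw_name_bytes, flags) from the local file header at offset."""
    fields = LOCAL_HEADER.unpack_from(data, offset)
    if fields[0] != b"PK\x03\x04":
        raise ValueError("no local file header at offset %d" % offset)
    flags, name_len = fields[2], fields[9]
    start = offset + LOCAL_HEADER.size
    return data[start:start + name_len], flags


def find_name_mismatches(data):
    """List (central_name, raw_local_name) pairs whose names disagree."""
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            raw, flags = read_local_name(data, info.header_offset)
            encoding = "utf-8" if flags & UTF8_FLAG else "cp437"
            try:
                local = raw.decode(encoding)
            except UnicodeDecodeError:
                local = None  # an undecodable local name counts as a mismatch
            if local != info.filename:
                mismatches.append((info.filename, raw))
    return mismatches
```

Calling find_name_mismatches(open("book.epub", "rb").read()) on a well-formed epub should give an empty list; "book.epub" is just a placeholder filename.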

Either form of breakage completely breaks the Python standard library module for accessing zips (zipfile.py). The only way around this is to create your own zipfile.py that looks for and catches central directory filename decoding errors to work around this nonsense.
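As a rough idea of what catching those decoding errors could look like, here is a minimal Python 3 sketch that walks the central directory by hand and falls back to cp437 when a name flagged as UTF-8 does not decode. It is not the actual patched zipfile.py; the names decode_name and list_central_names are hypothetical, and the sketch ignores ZIP64 archives and assumes the end of central directory signature does not also occur inside the archive comment.

```python
import struct

# End of central directory record (22 bytes):
# signature, disk, cd disk, entries on disk, total entries, cd size, cd offset, comment len
EOCD = struct.Struct("<4s4H2LH")
# Central directory file header (46 bytes), fixed part only
CD_HEADER = struct.Struct("<4s6H3L5H2L")
UTF8_FLAG = 0x800  # general purpose bit 11: "filename is UTF-8 encoded"


def decode_name(raw, flags):
    """Decode a filename, tolerating a UTF-8 flag set on non-UTF-8 bytes."""
    if flags & UTF8_FLAG:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            pass  # the flag lied; fall through to the legacy encoding
    return raw.decode("cp437")  # cp437 maps every byte, so this cannot fail


def list_central_names(data):
    """Return the decoded filenames found in the central directory of data."""
    pos = data.rfind(b"PK\x05\x06")
    if pos < 0:
        raise ValueError("end of central directory record not found")
    eocd = EOCD.unpack_from(data, pos)
    total_entries, cd_offset = eocd[4], eocd[6]

    names = []
    pos = cd_offset
    for _ in range(total_entries):
        fields = CD_HEADER.unpack_from(data, pos)
        if fields[0] != b"PK\x01\x02":
            raise ValueError("bad central directory header at offset %d" % pos)
        flags = fields[3]
        name_len, extra_len, comment_len = fields[10], fields[11], fields[12]
        start = pos + CD_HEADER.size
        names.append(decode_name(data[start:start + name_len], flags))
        # Skip the variable-length name, extra field, and comment
        pos = start + name_len + extra_len + comment_len
    return names
```

The cp437 fallback is only one possible choice; it never raises, but the resulting names may still be gibberish, so a real fixer would probably want to rename such entries as well.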

If you are desperate, we can post an ePubFixer program for you. It requires you to have Python 2 installed with Tk widgets (see ActiveState ActivePython 2.7 if on Windows; Macs and Linux are all set to go). It will read in the broken epub and write out a fixed epub, which should then work properly with calibre.

The long-term solution is for calibre to implement its own zipfile.py code (if it does not do that already) and handle the special case of improper UTF-8 flags being set on garbage central directory filenames.

The better solution, if this is a B&N epub, is to send the ebook back and request an epub that actually meets the epub specification!

Hope this helps,

KevinH