Old 01-14-2017, 09:46 AM   #45
KevinH
Sigil Developer
more on zip filename encoding

@slowsmile
Quote:
Originally Posted by slowsmile
@KevinH...Altering the zip file name encoding to utf-8 using zipinfo flags might work on Linux and Mac but apparently, according to the python issue tracker, it still doesn't work on Windows.

Python Issue27344

I also found out that the -U switch will work on the command line version of PKZip or WinZip to add utf-8 file names to the Zip file. But no apparent capability for this exists yet on Windows because this issue is still open for resolution. Very frustrating.
Actually, that bug post says the internal zip library in Python 2.x and 3.x does work with unicode and utf-8 filenames; it is the documentation that is wrong. If you use python's zip module to create zips and unzip things, unicode filenames do work on all platforms. That said, if a user on Windows does not use the python zip module and instead unpacks a zip using the Windows builtin zip utility, most likely the utf-8 encoded filenames will not be properly unpacked.
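
As a rough illustration of the point about python's zip module, here is a minimal sketch, assuming Python 3 and an in-memory archive. The helper name roundtrip_unicode_name and the sample path are made up for the example and do not come from Sigil or any plugin. It writes one member with a non-ascii name, reads it back, and checks general purpose flag bit 0x800, which the zip format uses to mark a utf-8 encoded filename.

```python
import io
import zipfile


def roundtrip_unicode_name(name, payload):
    """Write one member with the given name, then read it back.

    Hypothetical helper for illustration only.
    Returns (stored filename, stored bytes, utf-8 flag set?).
    """
    buf = io.BytesIO()  # in-memory archive, nothing touches the disk
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)

    buf.seek(0)
    with zipfile.ZipFile(buf, "r") as zf:
        info = zf.infolist()[0]
        # Bit 0x800 of the general purpose flags marks a utf-8 filename.
        utf8_flag = bool(info.flag_bits & 0x800)
        return info.filename, zf.read(info), utf8_flag


# Sample path is an assumption, chosen only because it has a non-ascii char.
name, data, utf8_flag = roundtrip_unicode_name(
    "OEBPS/Text/Überblick.xhtml", b"<html/>"
)
print(name, utf8_flag)
```

Whether some other unzip tool honours that flag is a separate question, which is the builtin Windows utility problem described above.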

Given that the plugin can and should be using the python zip module to create the epub, and that Sigil uses its own zip module to handle things, the fact that the Windows builtin zip utility is broken should not matter.

FWIW, you should also make sure that any hrefs used in the content.opf or in links throughout the document are properly url encoded to preserve any non-ascii chars used.
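
For the href side of things, something along these lines might do. This is only a sketch, assuming Python 3 and hrefs that are plain relative paths with an optional #fragment. The function name encode_href is made up for the example; urllib.parse.quote and unquote are the standard library calls doing the work.

```python
from urllib.parse import quote, unquote


def encode_href(href):
    """Percent-encode a relative href, leaving '/' and the '#' separator alone.

    Hypothetical helper for illustration only. Non-ascii chars are encoded
    as utf-8 bytes, then percent-escaped.
    """
    path, sep, fragment = href.partition("#")
    # quote() keeps '/' unescaped by default, so directory separators survive.
    return quote(path) + sep + quote(fragment)


# Sample href is an assumption, picked for its non-ascii char and space.
href = "Text/Überblick 1.xhtml#ch1"
encoded = encode_href(href)
print(encoded)  # Text/%C3%9Cberblick%201.xhtml#ch1

# Decoding gets the original name back for looking up the file in the zip.
print(unquote(encoded) == href)
```

The idea is that the name stored in the zip stays as the real unicode filename, and only the references to it in the opf and the xhtml links get the percent-encoded form.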

Hope this helps,

KevinH