@Kevin & @DiapDealer... I've got epub_zip_up_book_contents() working in the plugin, but when it converts the Polish ebook -- Brassia Grim -- to epub, the result is exactly the same as before: the file names displayed in Sigil's Book Browser are in DOS Latin rather than UTF-8, so the Polish characters don't show. The contents.xhtml toc items are also exactly the same as before.
I must also add that at no point in my plugin do I read from or write to files in anything other than UTF-8.
And in my desperate trawl of the internet for more information about zip files, I stumbled across what might be a rather large gorilla in the room.
I found out that the Windows NTFS file system (used on Windows 7, 8 & 10) uses UTF-16 for all file names.
So here's another question:
Can Python's ZipInfo object and its flag bits be set so that a UTF-16 NTFS file name is added to a zip file as UTF-8? Or will the UTF-16 file name always revert to DOS Latin encoding in the zip archive? I'm asking because when I checked WinRAR's ability to change the internal file name encoding (Options > Name Encoding), there was no UTF-16 option -- only UTF-8.
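To make the question concrete, here is a minimal sketch of the test I have in mind. It assumes Python 3, where file names are Unicode str objects whatever the file system stores on disk. The helper name check_name and the sample file name are made up for illustration. As far as I understand it, zipfile stores a non-ASCII str name as UTF-8 and sets general purpose bit 11 (0x800) by itself, with no manual ZipInfo flag work; on Python 2 the name would apparently have to be a unicode object rather than a byte string for that to happen.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: "file name is UTF-8"


def check_name(name):
    """Write one entry called `name` into an in-memory zip, then re-open
    the archive and return (utf8_flag_is_set, name_as_read_back)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, b"<html/>")

    # Re-open so the flag and the name come from the archive itself,
    # not from the ZipInfo object that was used for writing.
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return bool(info.flag_bits & UTF8_NAME_FLAG), info.filename


if __name__ == "__main__":
    # Hypothetical Polish file name, not one taken from the real book.
    print(check_name("OEBPS/Text/Rozdział_ąęść.xhtml"))
    print(check_name("OEBPS/Text/chapter1.xhtml"))
```

If the first call reports the flag as set and the name comes back unchanged, then the zip side is fine and the DOS Latin names must be creeping in somewhere else.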
Lastly, I'm quite open and willing to believe that Python's zipfile module can convert and store UTF-8 file names, but as yet I have seen no evidence of this happening, either in my own module or after using the epub_zip_up_book_contents() function from the Plugin Framework. It also hasn't helped that Python's documentation says next to nothing about what the ZipInfo flag bits do or how to use them. I'm now off to try to find some decent, reliable flag bit code on Nullege, GitHub and the like.
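In the meantime, to get some actual evidence either way, a small inspection helper along these lines should show what is really stored in a finished epub. This is only a sketch, assuming Python 3; name_encoding_report is a made-up name. It lists every entry together with whether the UTF-8 name flag (bit 11, 0x800) is set. My understanding is that zipfile decodes names without that flag as cp437 by default, which would produce exactly the DOS-looking names I'm seeing.

```python
import zipfile


def name_encoding_report(zip_source):
    """Return a list of (filename, flagged_as_utf8) for each archive entry.

    zip_source may be a path to an epub/zip or an open file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        return [
            (info.filename, bool(info.flag_bits & 0x800))
            for info in zf.infolist()
        ]
```

Calling it on the converted book, e.g. name_encoding_report("Brassia Grim.epub") (a hypothetical path), and looking for non-ASCII names that come back unflagged should tell me whether the zip step is the culprit.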