12-12-2012, 02:25 PM   #159
ShellShock
Wizard
Quote:
Originally Posted by tshering
Today I ran into a problem. I made a French dictionary. To my disappointment, no word with an accented vowel in its first two letters showed up. The reason was evidently the encoding of the filenames inside the zip file. This surprised me, since my Japanese dictionaries work perfectly, so why shouldn't the French dictionary? Anyhow, I tried zipping the file under Ubuntu rather than under Windows, and the dictionary worked fine.
The culprit is a strange default behavior of 7-Zip under Windows (or maybe I should rather say the culprit was me not being aware of it). By default, 7-Zip encodes a filename in UTF-8 only if it contains characters that the local codepage does not support. Therefore, under Windows, one has to force UTF-8 encoding explicitly.
Code:
7z a -tzip -mcu=on dicthtml.zip *.html words
Great find!
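For anyone who wants to check an existing archive before loading it on the device, here is a rough Python sketch. It is not part of tshering's workflow, and the function name non_utf8_names is made up for illustration. It relies on two things I am fairly sure of: the zip format marks UTF-8 filenames with bit 11 (0x800) of the general purpose flags, and Python's zipfile module decodes unflagged names as cp437, so any non-ASCII name without the flag was stored in some legacy codepage.

```python
import zipfile

# Bit 11 of the zip "general purpose bit flag": filename is stored as UTF-8.
UTF8_FLAG = 0x800


def non_utf8_names(zip_source):
    """Return entry names that contain non-ASCII characters but lack the UTF-8 flag.

    zip_source may be a path or a file-like object. Names come back as Python
    decoded them (cp437 for unflagged entries), so they may look garbled.
    """
    with zipfile.ZipFile(zip_source) as zf:
        return [
            info.filename
            for info in zf.infolist()
            # Pure ASCII names are identical in every codepage, so skip them.
            if not info.filename.isascii()
            and not (info.flag_bits & UTF8_FLAG)
        ]
```

Calling non_utf8_names("dicthtml.zip") on an archive built with -mcu=on should give an empty list. If it returns names, those entries were presumably written in the local codepage and are the likely candidates for words that fail to show up.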