04-23-2023, 04:23 PM   #11
EastEriq
Groupie
EastEriq can program the VCR without an owner's manual.
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Good to know that the PocketBook .dic format has been decoded to some extent.

I also understood that your gist works correctly only in Python 3.9, not in 3.8 or 2.7.

I quickly compared the output of your gist on VocabolarioItaliano.dic from the original post with its output on VocabolarioItaliano_reconstructed.dic, which is the result of Markismus' PocketBookDict. The results differ mostly in spacing (which I corrected) and in the ordering of words with accented vowels. The result from VocabolarioItaliano_reconstructed.dic has the ordering right imho: it alphabetizes accented vowels as if they were unaccented, as is customary in Italian, rather than listing them in code order after all the unaccented letters. But I also spotted cases in which one file includes sense (1) of a term where the other includes sense (2) (e.g. abbaglio, abbonare, acapnia, accapponare). We don't know how the original VocabolarioItaliano.dic was really generated, but I note that these multiple senses are all present in the VocabolarioItaliano.txt of the original attachment. @Markismus, bug in your program? As for abbozzare, I have 3 entries in the csv I attached, whereas your result holds only (1).
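To make the ordering difference concrete, here is a minimal Python sketch of the two sort orders. It is not taken from the gist or from PocketBookDict; the function name italian_sort_key and the sample words are made up for illustration, and it only approximates Italian collation by stripping accents before comparing.

```python
import unicodedata

def italian_sort_key(word):
    """Hypothetical helper: compare words as if accented vowels were unaccented."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks, keeping only the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word is a tie-breaker, so "e" and "è" sort deterministically.
    return (base.casefold(), word)

# Made-up sample, not taken from VocabolarioItaliano.
words = ["zio", "è", "ebano", "e", "età"]

by_code_point = sorted(words)                        # plain code-point order
by_base_letter = sorted(words, key=italian_sort_key) # accents ignored

print(by_code_point)
print(by_base_letter)
```

With precomposed (NFC) input, the plain sort puts "è" after "zio", while the keyed sort puts it right next to "e". A locale-aware collation would be the more thorough approach, but it depends on the locales installed on the machine.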

Not sure what to do with _info.txt, which contains as Wordlist only a few of the terms, with some missing their initial letter (e.g. "ccozzaticcio", "d usum delphini", "pparenza"), but whatever...