I've spent some time regularizing this Vocabolario_Italiano.txt. Looking into it, I found a number of blank lines, repeated keywords, repeated entries, and irregular spacing. Moreover, proper names were not capitalized. Possibly all of this comes from an earlier conversion attempt.
I corrected what I caught and standardized on three spaces as the delimiter. My result is attached.
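For anyone who wants to repeat this kind of cleanup, here is a rough Python sketch of the mechanical part. It is not what I actually ran, and it rests on assumptions: one entry per line, with the headword separated from the definition by a tab or a run of two or more whitespace characters. The function name normalize_lines is made up for illustration. It only drops blank lines and exact duplicate entries and rewrites the delimiter; merging repeated keywords with different definitions and capitalizing proper names still need a human eye.

```python
import re

DELIM = "   "  # three spaces, the delimiter settled on above


def normalize_lines(lines):
    """Drop blank lines and exact duplicates, and unify the delimiter.

    Assumption (not verified against the real file): each entry is one line,
    headword first, then a tab or 2+ whitespace characters, then the definition.
    """
    seen = set()
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split only at the first delimiter-like gap.
        parts = re.split(r"\s{2,}|\t", line, maxsplit=1)
        if len(parts) == 2:
            line = parts[0].strip() + DELIM + parts[1].strip()
        if line in seen:
            continue  # skip entries already emitted
        seen.add(line)
        out.append(line)
    return out


if __name__ == "__main__":
    sample = ["casa  house\n", "\n", "casa   house\n", "Roma\tcity\n"]
    for entry in normalize_lines(sample):
        print(entry)
```

Lines without a recognizable delimiter are passed through unchanged, so they can be inspected by hand afterwards.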
With that I can run Markismus' tool and generate an xdxf (in fact two, _reconstructed and _unbloated; I don't know which is better) and the .dic for PocketBook again.
What I could not manage at all was generating the StarDict version from the xdxf. PocketBookDict didn't produce it, despite my setting isCreateStardictDictionary = 1 in DicControls.pm and DicGlobals.pm (I don't know Perl). The new version of pyglossary requires Python 3.9 while I am on 3.8, and an older pyglossary gives me an empty .dict.
If someone wants to take over from here....
PS. The attached file contains a single & at line 69831, which may have to be replaced with &amp; depending on the toolchain.
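If the toolchain does need the ampersand escaped (XML-based formats such as xdxf normally do), a small Python sketch like the following could handle it. The function name escape_bare_ampersands is hypothetical, and the regular expression is only my guess at what counts as "already an entity": a name or a numeric reference followed by a semicolon.

```python
import re

# A bare "&" is one that does not already start an entity such as
# "&amp;", "&#38;" or "&#x26;".
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace every bare '&' with '&amp;', leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    print(escape_bare_ampersands("Tizio & Caio"))
    print(escape_bare_ampersands("Tizio &amp; Caio"))
```

Because existing entities are left untouched, running it twice on the same text should not double-escape anything.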
Last edited by EastEriq; 04-22-2023 at 05:11 PM.