View Single Post
Old 09-08-2022, 01:24 PM   #3
Markismus
Guru
Markismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicingMarkismus causes much rejoicing
 
Markismus's Avatar
 
Posts: 955
Karma: 149907
Join Date: Jul 2013
Location: Rotterdam
Device: HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
XML is a language for data storage, not a dictionary format. So you can't expect pyglossary to support any XML whatsoever. However, you can put the XML-file online and post a link to it. Maybe you're lucky.

Suggestions
The Pdf-(2-epub-)2-html-2-stardict tool isn't there, yet. Probably never. The problem is that the nice styling of a PDF puts a lot of extra code in there, that has to be differentiated from the words&definitions. Optically easy, but not code-wise. You could try to get ABBYY Finereader to recognize it and specify the output format as a spreadsheet or CSV-format. However, even ABBYY's output will still have a lot of noise, that you'll have to deal with.

What is the name of the dictionary? Maybe it's already present in a nicer format than PDF.
Markismus is offline