Quote:
Originally Posted by HarryT
Purely as a matter of interest, what is "linguistic compression"?
|
Linguistic compression was originally developed to save disk space and RAM. For example, most file compression methods use dictionary-based algorithms.
According to the Mobipocket website, Mobipocket/Kindle dictionaries use the Levenshtein distance algorithm to keep the file size down. For more information, see this Mobipocket website article.
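For anyone unfamiliar with it, Levenshtein distance is the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another. Below is a minimal Python sketch of the textbook dynamic-programming version. It is only an illustration of the distance itself; the function name is my own, and it says nothing about how Mobipocket actually applies the algorithm internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the first i-1 chars of a and first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("work", "worked"))     # 2
print(levenshtein("kitten", "sitting"))  # 3
```

An inflected form such as "worked" is usually only a few edits away from its headword, which is presumably why the distance is attractive for this purpose.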
Quote:
Originally Posted by tuxor
This is not support for morphology and/or inflections. This is support for synonyms, as I already mentioned above.
|
You obviously don't understand the difference between inflections and synonyms. For example, "brought" is an inflection of "bring", not a synonym of it, and any dictionary software that lets users find a headword by searching for an inflected form does support inflections.
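To make the point concrete, here is a rough Python sketch of what "finding a headword by searching for an inflected form" amounts to. The two tables and the `lookup` function are toy, hypothetical stand-ins of my own, not the actual data structures of StarDict, ColorDict, or any other program.

```python
# Hypothetical toy data: headword -> definition.
entries = {
    "bring": "to carry or convey something to a place",
    "work": "to do a job or task",
}

# Hypothetical toy data: inflected form -> headword.
# This is the kind of mapping a synonym/alternate-forms file would hold.
inflections = {
    "brings": "bring",
    "bringing": "bring",
    "brought": "bring",
    "works": "work",
    "working": "work",
    "worked": "work",
}


def lookup(term):
    """Return (headword, definition), or None if the term is unknown."""
    key = term.lower()
    # Try the term as a headword first, then as an inflected form.
    headword = key if key in entries else inflections.get(key)
    if headword is None:
        return None
    return headword, entries[headword]


print(lookup("brought"))  # resolves to the entry for "bring"
print(lookup("worked"))   # resolves to the entry for "work"
```

Whatever the file holding the second table happens to be called, a search for "worked" that lands on "work" is inflection support from the user's point of view.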
Quote:
Originally Posted by tuxor
Btw: If the OP had talked about simple morphology simulation using synonyms, he wouldn't have reported that this is not working with ColorDict because ColorDict has perfect support for synonym files!
|
No, s/he did not. If you re-read his/her post, you'll find that s/he mentions that searching for "worked" didn't bring up the entry for "work", and "worked" just so happens to be an inflection of "work". (Not once did s/he mention synonyms.)
Quote:
Originally Posted by tuxor
|
That page explains the binary format used by StarDict, not the .babylon source format that I used.
Quote:
Originally Posted by tuxor
I already talked about "simulating" morphology support using the synonyms feature of StarDict. And it is definitely insane simulating morphology support in this way whenever the language is a bit more complex - for Hungarian, Ancient Greek or even German the synonyms file would become incredibly large and you wouldn't even end up with "proper" morphology support.
|
The synonym file would indeed be a bit larger, but that doesn't significantly slow down lookups. For example, I created an Arabic-English StarDict dictionary with more than 80,000 entries, and its lookup speed on my ancient iPhone is about the same as that of dictionaries for other languages, even though most entries had 35+ inflected forms on average.
You really may want to do some actual tests instead of relying purely on third-hand information!
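If you want a quick feel for why a much larger index need not mean a much slower lookup, here is a small Python sketch. It assumes the index is a sorted list searched by binary search, which is my assumption for illustration and may not match what a given dictionary app really does. The key lists are synthetic and scaled down, and `binary_search` is a made-up helper that also counts how many entries it has to inspect.

```python
def binary_search(sorted_keys, term):
    """Return (index, probes); index is -1 if term is absent."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one key inspected per loop iteration
        if sorted_keys[mid] < term:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == term
    return (lo if found else -1), probes


# Synthetic, scaled-down indexes: 1,000 keys versus 35 times as many.
small_index = [f"w{i:06d}" for i in range(1_000)]
large_index = [f"w{i:06d}" for i in range(35_000)]

_, small_probes = binary_search(small_index, "w000777")
_, large_probes = binary_search(large_index, "w000777")
print(small_probes, large_probes)
```

Under that assumption, multiplying the number of keys by 35 adds only a handful of extra comparisons per lookup, because the cost grows with the logarithm of the index size rather than with the size itself.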