Old 01-23-2015, 05:41 AM   #689
sadowski
Connoisseur
Posts: 84
Karma: 1142796
Join Date: Jul 2009
Device: Sony PRS 350, Kobo mini, PB mini
Dictionary: fuzzy stardict search

The "fuzzy" search algorithm that stardict implements sometimes gives unhelpful results. For example, looking up
"handen" (Swedish for "the hand"; the definite article -en is appended to "hand")
in a Swedish dictionary returns the following 10 hits, in this order, none of them relevant:
anden, banden, bandens, handel, handeln, handels, hinden, hindens, hunden, tanden
The correct match, "hand", is missing even though the dictionary contains it.
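
To illustrate why "hand" can lose out, here is a minimal sketch that assumes the fuzzy search ranks candidates by something like plain Levenshtein edit distance. That metric is my assumption for illustration; I have not checked what stardict actually computes. The function name edit_distance is likewise just for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and replacements cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace ca by cb, or match
        prev = curr
    return prev[-1]


query = "handen"
candidates = ["hand", "handel", "hunden", "banden", "tanden", "anden", "bandens"]

# Sort by distance first, then alphabetically, and print the ranking.
for word in sorted(candidates, key=lambda w: (edit_distance(query, w), w)):
    print(edit_distance(query, word), word)
```

Under this assumption "hand" is two edits away from "handen" (drop "e" and "n"), while "handel", "hunden", "banden", "tanden" and "anden" are each only one edit away, so a ranking by distance alone puts the real root behind unrelated words.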

There seem to be two problems with this look-up algorithm:
1. If there is no exact match, stardict falls back to a fuzzy search that allows character replacements, insertions, and deletions anywhere in the word. It would be more useful to return a list of words starting with the query.

2. In most European languages, word roots can be found by manipulating endings, e.g., typically --> typical. The rules are language-specific, but applying them would make dictionary look-ups much more effective.
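
A rough sketch of what I have in mind for both points: try an exact match, then try stripping known endings, and only then fall back to a prefix list. Everything here is hypothetical: the names SUFFIXES, strip_suffixes, prefix_matches and lookup are made up for this example, the suffix list is a tiny hand-picked sample and not a real Swedish stemmer, and none of it is stardict code.

```python
from bisect import bisect_left

# Hypothetical, deliberately tiny list of Swedish endings (illustration only).
SUFFIXES = ["ens", "en", "et", "s", "n"]

# Sample word list; a real dictionary index would already be sorted.
WORDS = sorted(["anden", "banden", "bandens", "hand", "handel", "handeln",
                "handels", "hinden", "hindens", "hunden", "tanden"])


def strip_suffixes(word):
    """Yield candidate roots, each with one known ending removed."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            yield word[:-len(suffix)]


def prefix_matches(sorted_words, prefix, limit=10):
    """Return up to `limit` entries that start with `prefix` (binary search)."""
    start = bisect_left(sorted_words, prefix)
    hits = []
    for word in sorted_words[start:start + limit]:
        if not word.startswith(prefix):
            break
        hits.append(word)
    return hits


def lookup(sorted_words, query):
    """Exact match first, then suffix-stripped roots, then a prefix list."""
    known = set(sorted_words)  # a real implementation would build this once
    if query in known:
        return [query]
    roots = [root for root in strip_suffixes(query) if root in known]
    if roots:
        return roots
    return prefix_matches(sorted_words, query)


print(lookup(WORDS, "handen"))  # suffix stripping finds the root
print(lookup(WORDS, "hande"))   # no root found, so the prefix list is used
```

With this sample list, looking up "handen" yields "hand" through the -en rule, and the incomplete query "hande" falls through to the prefix list (handel, handeln, handels). The suffix table is the language-specific part and would have to be supplied per dictionary.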

Has anyone else stumbled over this? Any suggestions?

Jens