03-23-2025, 09:06 AM   #13
KevinH
Sigil Developer
No, spellcheck does not ignore user dictionaries that are properly specified, but it does follow the lang and xml:lang attributes religiously to determine which dictionary to check a potential word against.

But in this plugin's case, the foreign words are not wrapped in span tags carrying xml:lang or lang attributes, so spellcheck falls back to the xml:lang or lang attribute on the html tag or, if neither is present, to the first dc:language metadata tag in the OPF.
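To make that fallback order concrete, here is a minimal sketch in Python. The function name, the dict-based attribute inputs, and the regex lookup in the OPF are all my own illustration, not Sigil's actual code, and checking xml:lang before lang is an assumption.

```python
import re

# Hypothetical helper: attributes are passed in as plain dicts and the OPF as a string.
DC_LANGUAGE = re.compile(r"<dc:language[^>]*>\s*([^<\s]+)\s*</dc:language>")

def effective_lang(span_attrs, html_attrs, opf_text):
    """Return the language code a word would be spellchecked against, or None."""
    # Nearest element first: the wrapping span, then the html tag.
    for attrs in (span_attrs, html_attrs):
        # Assumption: xml:lang is consulted before lang.
        for key in ("xml:lang", "lang"):
            value = attrs.get(key)
            if value:
                return value
    # Last resort: the first dc:language entry in the OPF metadata.
    match = DC_LANGUAGE.search(opf_text)
    return match.group(1) if match else None
```

With no span and no attributes on the html tag, every word ends up checked against the first dc:language, which is exactly why foreign words just show up as misspellings.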

So it is a bit of a catch-22 here. You need the lang attributes to know which dictionary to look a word up in, but you need spellcheck to determine whether a word is potentially a foreign word or just an incorrectly spelled one.

Detecting individual candidate foreign words is possible with only one dictionary, and from that you can generate a word list of probable foreign words. But until spans are added and merged, and xml:lang/lang attributes for the proper foreign language are added, there is no way to tell which dictionary to look a word up in.

I think the best approach is multi-pass. The first pass uses a single dictionary to create a list of individual foreign words, wraps those in span tags, and adds the proper lang and xml:lang attributes to each.
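A rough sketch of that first pass, in Python. It assumes the main dictionary is available as a plain set of lowercase words and that the text is a plain text run with no markup in it; a real plugin would query hunspell and walk the XHTML text nodes instead. The function names and the word-to-language mapping are hypothetical, and the language for each word still has to be supplied by whoever reviews the list.

```python
import re

# Runs of letters, optionally joined by apostrophes (so "l'ami" is one word).
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")

def probable_foreign_words(text, known_words):
    """List words, in order of first appearance, that the single dictionary rejects."""
    found = []
    for match in WORD_RE.finditer(text):
        word = match.group(0)
        if word.lower() not in known_words and word not in found:
            found.append(word)
    return found

def wrap_words(text, lang_by_word):
    """Wrap each listed word in a span carrying both lang and xml:lang."""
    def repl(match):
        word = match.group(0)
        lang = lang_by_word.get(word)
        if lang is None:
            return word
        return '<span lang="{0}" xml:lang="{0}">{1}</span>'.format(lang, word)
    return WORD_RE.sub(repl, text)
```

Note that a plain misspelling lands in the same list as a real foreign word, so the list needs a human look before anything gets wrapped.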

The next pass uses the find-foreign-words variant of this plugin, or just a search for xml:lang, to visually/manually fix any intervening adjacent words that were not properly detected, adding the proper span and lang info to them.

The final pass merges adjacent spans with matching xml:lang attributes.
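The merge pass could look something like this in Python. This is only a sketch: it assumes every span was written in exactly the form lang="xx" xml:lang="xx" with no other attributes, and that nothing but whitespace sits between the spans being merged. The function name is made up for illustration.

```python
import re

# A closing span followed only by whitespace and an opening span for the same language.
ADJACENT = re.compile(
    r'(<span lang="([^"]+)" xml:lang="\2">)([^<]*)</span>(\s*)'
    r'<span lang="\2" xml:lang="\2">'
)

def merge_adjacent_spans(html):
    """Collapse runs of neighbouring same-language spans into a single span."""
    previous = None
    # One substitution pass joins spans pairwise, so repeat until nothing changes.
    while previous != html:
        previous = html
        html = ADJACENT.sub(r"\1\3\4", html)
    return html
```

Spans in different languages are left alone because the backreference only matches when the language codes are identical.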

Plugins can in fact use hunspell spellchecking directly, so inside a plugin you could force a lookup of a word to see whether it exists in both dictionaries and, if it is adjacent to an existing span, add it to that span.

Then the entire process would be done inside the plugin.
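Here is a sketch of how that adjacency check might work, in Python. The hunspell binding itself is not shown: is_known(lang, word) is a hypothetical stand-in for whatever lookup the plugin does, and the token list of [word, lang] pairs is just an illustration of the idea, not how the plugin stores anything.

```python
def absorb_adjacent(tokens, is_known):
    """Pull untagged words into a neighbouring language run when that language accepts them.

    tokens:   list of [word, lang_or_None]; None means the word passed the main dictionary.
    is_known: hypothetical callable (lang, word) -> bool standing in for a hunspell lookup.
    """
    result = [list(token) for token in tokens]
    changed = True
    # Repeat so a run can keep growing outward one word at a time.
    while changed:
        changed = False
        for i, (word, lang) in enumerate(result):
            if lang is not None:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < len(result) and result[j][1] is not None:
                    neighbour_lang = result[j][1]
                    # The word is valid in the main dictionary AND in the neighbour's.
                    if is_known(neighbour_lang, word):
                        result[i][1] = neighbour_lang
                        changed = True
                        break
    return result
```

This will sometimes grab a word that really belongs to the main language (short words valid in both), so a manual review pass is probably still worth keeping.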

Last edited by KevinH; 03-23-2025 at 09:12 AM.