Old 01-25-2022, 05:49 PM   #57
KevinH
Sigil Developer
Thanks ... I found a second list of the hyphenated words and it had over 47,000 entries.

I think the only solution is to remove the "-" from WORDCHARS and accept any hyphenated word whose parts are all okay when it is split into separate words at the hyphen. It seems SCOWL was based on this assumption, and I really do not want to have to add and maintain 40,000+ hyphenated words. Our old dictionary used this approach as well.
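To make the "split at the hyphen" rule concrete, here is a rough sketch in Python. It is not how Sigil or Hunspell implement it; the function name `accepts_by_parts` and the tiny word set are made up purely for illustration.

```python
def accepts_by_parts(word, known_words):
    """Return True if every hyphen-separated part of `word` is a known word.

    `known_words` is a hypothetical set of lowercase dictionary words.
    """
    parts = word.split("-")
    # Reject empty parts so that "-foo", "foo-" and "foo--bar" do not pass.
    return all(part and part.lower() in known_words for part in parts)


# Toy word list, purely illustrative (not a real dictionary).
known = {"well", "known", "mother", "in", "law"}

print(accepts_by_parts("well-known", known))
print(accepts_by_parts("mother-in-law", known))
print(accepts_by_parts("well-knwon", known))
```

With this rule the dictionary itself needs no hyphenated entries for compounds built from ordinary words; only the ones that fail the split check would have to be listed explicitly.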

It is always something!

And I learned something new. Hunspell is smarter than MySpell and will actually try splitting hyphenated words and checking each part automatically, whether there is a "-" in WORDCHARS or not.

So that means we really only need to add hyphenated words that are not already covered by that rule. The rule seems to cover a lot of words, just not self-defense or self-defence.

So it looks like I can take my list of 47,000 hyphenated words and spellcheck them using Hunspell to see how many of them actually need to be special-cased, other than self-defense/self-defence.
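That filtering pass could look roughly like the sketch below. The checker is passed in as a plain function, so nothing here depends on a particular Hunspell binding; `find_special_cases` and `toy_spell` are hypothetical names, and `toy_spell` is only a stand-in so the example runs on its own. In practice `spell` would be whatever call asks Hunspell whether a whole word is accepted.

```python
def find_special_cases(words, spell):
    """Return the words that `spell` rejects, preserving input order.

    `words` is any iterable of strings (for example, lines of a word list).
    `spell` is a callable taking one word and returning True if accepted.
    """
    stripped = (line.strip() for line in words)
    # Skip blank lines; keep only words the checker does not accept.
    return [w for w in stripped if w and not spell(w)]


# Stand-in checker (an assumption, not Hunspell): accepts a word when every
# hyphen-separated part is in a toy word set.
TOY_WORDS = {"well", "known", "old", "fashioned"}

def toy_spell(word):
    return all(p and p.lower() in TOY_WORDS for p in word.split("-"))


candidates = ["well-known\n", "old-fashioned\n", "\n", "hurly-burly\n"]
needs_entry = find_special_cases(candidates, toy_spell)
print(needs_entry)
```

Whatever comes out of that pass is the short list that would actually need explicit dictionary entries.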

Last edited by KevinH; 01-25-2022 at 06:42 PM.