No spellcheck dictionary can tell what the original author meant. It is not a grammar checker, it does not know parts of speech, and it cannot see the words that surround the misspelled one.
So spellcheckers take the misspelled word and look for correct words that are only 1 edit away. Then they try swapping adjacent characters, then they try inserting a new character at every position (including a space), then they run through the replacement table provided in the .aff file. If phonetic changes are enabled in the .aff file, they will try those, and finally, if still no good words are found, they will use ngrams to make a suggestion.
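As a rough illustration of the first few steps only (single-character edits, adjacent swaps, and inserting a space), here is a minimal Python sketch. It is not the spellchecker's actual code and does not reproduce its ordering, the .aff replacement table, the phonetic rules, or the ngram fallback. The function names and the tiny `KNOWN` word list are made up for the example.

```python
import string

def single_edits(word, alphabet=string.ascii_lowercase):
    """Return every string one delete, adjacent swap, replace, or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | swaps | replaces | inserts) - {word}

def space_splits(word, known):
    """Insert a space at each position; keep splits where both halves are known."""
    return [
        word[:i] + " " + word[i:]
        for i in range(1, len(word))
        if word[:i] in known and word[i:] in known
    ]

def suggest(word, known):
    """Toy suggester: known single-edit words first, then two-word splits."""
    lower = word.lower()  # case handling is ignored in this sketch
    hits = sorted(c for c in single_edits(lower) if c in known)
    return hits + space_splits(lower, known)

# Hypothetical stand-in for the real dictionary (assumption, not scowl data).
KNOWN = {"heater", "cheater", "theatre", "treater",
         "t", "th", "thea", "eater", "ter"}

print(suggest("Theater", KNOWN))
# ['cheater', 'heater', 'theatre', 'treater', 't heater', 'th eater', 'thea ter']
```

Note how short entries such as "t", "th", and "ter" are what let the space-insertion step produce two-word suggestions.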
So the word Theater, which is not spelled correctly under en-GB, is modified in these ways to look for "close" words that the original author could have meant.
In this case we get the following list:
Heater
Cheater
T heater
Th eater
Th-eater
Thea ter
Thea-ter
Theatre
Treater
Heather
Nearly all of these are only 1 character edit, swap, or insertion away from Theater (Heather is the exception, at two edits).
"Thea" is accepted because it is a proper name for someone, and "ter" because it is a known abbreviation for "Total Expense Ratio", etc. Having things like "ter" and "th" be considered "words" is generally not a good idea, but scowl obviously included them at some point, most likely to make things like 105th work.
Here is where those pieces come from in scowl:
english-upper.50:Th
english-abbreviations.70:ter
english-proper-names.50:Thea
The spellchecker has no way to know what you meant by Theater, and based on small changes it could be any of these valid combinations.
When suggesting, the case is changed to match the misspelled word's case, which makes Thea (a woman's proper name) quite likely as it would need no case change.
If you try the lowercase version "theater" you will get a much smaller list of suggestions, as its case rules out proper first names.
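A small Python sketch of that case rule, under a simplifying assumption: an entry stored with a leading capital is treated as a proper name that may not be lowercased, while a lowercase entry may be capitalised to match the misspelled word. The function names and the `ENTRIES` list are hypothetical, and a real spellchecker's case handling has more rules than this.

```python
from typing import Optional

def fit_case(entry: str, misspelled: str) -> Optional[str]:
    """Recase a dictionary entry to match the misspelled word, or reject it."""
    if misspelled[:1].isupper():
        # Capitalised input: lowercase entries are capitalised,
        # proper names pass through unchanged.
        return entry[:1].upper() + entry[1:]
    if entry[:1].isupper():
        # Lowercase input: a proper name cannot be lowercased, so drop it.
        return None
    return entry

def filter_by_case(entries, misspelled):
    """Keep only the entries whose case can be made to fit the misspelled word."""
    fitted = (fit_case(e, misspelled) for e in entries)
    return [f for f in fitted if f is not None]

# Hypothetical dictionary entries (assumption, for illustration only).
ENTRIES = ["heater", "theatre", "Thea"]

print(filter_by_case(ENTRIES, "Theater"))  # ['Heater', 'Theatre', 'Thea']
print(filter_by_case(ENTRIES, "theater"))  # ['heater', 'theatre']
```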
Hope this explains things a bit better.
This is a great illustration of why adding proper first names and bunches of abbreviations to the spellchecker dictionary is not the best idea.
You might want to try the size 60 en-GB dictionaries to see if you like those better. They should have fewer proper names and fewer abbreviations without periods in them, which should prevent those from being considered as valid suggestions.
And if a misspelled word begins with an uppercase letter, be prepared for proper first names to be part of the suggestions.
People complain when a first name is marked as not correctly spelled, and they put pressure on the spellchecker to include it, but including first names in the main dictionary really makes no sense. One of the reasons for the pressure is that some programs that use spellcheckers do not allow user word lists to be kept, edited, and used, which in turn leads to main dictionary bloat.
Hope this helps.
Quote:
Originally Posted by Ashjuk
I am still a little puzzled by how the suggested replacement words works.
Today I right clicked 'Theater' expecting the first word in the replacement list to be Theatre, but not so. The suggested replacements for Theater are:
Heater
Cheater
T heater
Th eater
The ater
The-ater
Heather
Thatcher
Theatre does not even make it on the list. Yet if I right-click center the first word it offers is centre.
Why is that?