03-13-2022, 08:48 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2021
Device: Windows 10
|
Spellcheck Sigil 1.91
Hi, has anyone noticed that spellcheck in Sigil picks up words like a person's name (for example, Clayton) at the end of a sentence, includes the full stop, and shows both Clayton and Clayton. as being incorrect? This behavior appears to be recent.
|
03-13-2022, 09:08 AM | #2 |
Sigil Developer
Posts: 7,683
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Yes, the built-in dictionary has enabled proper spelling of abbreviations and things like "etc.". Before, the dictionary improperly accepted "etc" as being correct and it spell checked "F.B.I." as 3 separate words "F", "B", and "I". These short incorrect "root" words are then used to generate multiple bad suggestions.
The new dictionary allows the ending period to be included with unknown words in case they are abbreviations. So words like "etc." and "F.B.I." can be checked properly. It then checks the word with and without the ending period and if that word is not in the dictionary, it reports it as misspelled. The suggestions generated will include word variations with and without the ending period so the user can select what that ending period is meant to do. If you want to revert to the earlier behaviour, install an older hunspell dictionary into your Sigil Preferences and it will disable that new feature, as the dictionary itself has to be designed to support proper spellchecking of abbreviations. |
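The checking rule described in the post above can be sketched roughly as follows. This is only an illustration of the described behaviour, not Sigil's actual code: the tokenizer regex, the function names, and the tiny sample dictionary are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical tokenizer: keeps inner periods ("F.B.I.") and one trailing
# period attached to the word, as the post describes for the new dictionary.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:\.[A-Za-z]+)*\.?")


def is_misspelled(token, dictionary):
    """Check the token as-is, then without its ending period."""
    if token in dictionary:
        return False
    if token.endswith(".") and token[:-1] in dictionary:
        return False
    return True


def misspelled_list(text, dictionary):
    """Count every token form that fails the check (one list entry per form)."""
    return Counter(t for t in TOKEN_RE.findall(text) if is_misspelled(t, dictionary))


# Made-up sample data: "etc." and "F.B.I." are dictionary entries,
# "Clayton" is not.
DICTIONARY = {"I", "met", "and", "the", "agents", "left", "etc.", "F.B.I."}
TEXT = "I met Clayton. Clayton and the F.B.I. agents left, etc."

if __name__ == "__main__":
    for word, count in misspelled_list(TEXT, DICTIONARY).items():
        print(word, count)
```

With this sample, a known word that ends a sentence passes, while the unknown name is listed twice, once as "Clayton" and once as "Clayton.", which matches what the first post reports.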
08-30-2022, 06:55 AM | #3 | |
Imperfect Perfectionist
Posts: 472
Karma: 724664
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
|
Quote:
WORDCHARS -. Remove the period (and restart Sigil, if it is running), and the spellchecker will revert to (what I call) normal, that is no double entries of "misspelled" words with and without a period. (And of course, if the dog in reality lies buried somewhere else in the spellchecker, it would be nice to know the correct way of toggling it.) Regards, Kim Last edited by elibrarian; 09-02-2022 at 03:48 AM. Reason: Removed wrong information about other setting |
|
08-30-2022, 08:16 AM | #4 |
Sigil Developer
Posts: 7,683
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Sigil creates its own dictionary to enable this feature. No standard hunspell dictionary does this that I know of. Yes, simply removing the period from WORDCHARS in any aff will change the break iterator algorithm, resulting in the return of the old behaviour, with etc and F B I being considered correct.
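For anyone who prefers to script the WORDCHARS edit rather than do it by hand, here is a minimal sketch. It works on the text of an .aff file and assumes the line has the form `WORDCHARS -.` as quoted above. The function name and the sample file content are made up for illustration, and where the .aff file lives on disk is deliberately left out because the thread does not say.

```python
def strip_period_from_wordchars(aff_text):
    """Return aff_text with '.' removed from any WORDCHARS line.

    If removing the period leaves WORDCHARS with no characters at all,
    the line is dropped entirely.
    """
    out = []
    for line in aff_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "WORDCHARS":
            chars = parts[1].replace(".", "")
            if not chars.strip():
                continue  # nothing left to declare
            line = "WORDCHARS " + chars
        out.append(line)
    result = "\n".join(out)
    if aff_text.endswith("\n"):
        result += "\n"
    return result


# Made-up sample content, only to show the effect on the WORDCHARS line.
SAMPLE_AFF = "SET UTF-8\nWORDCHARS -.\nTRY abc\n"

if __name__ == "__main__":
    print(strip_period_from_wordchars(SAMPLE_AFF))
```

Keep a backup copy of the original .aff before writing the result back, and restart Sigil afterwards, as noted in post #3.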
|
08-30-2022, 06:26 PM | #5 | ||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
The changelog said:
Quote:
Quote:
The amount of acronym-periods vs. sentence-ending-periods are astronomical. And to heavily weigh Spellcheck Lists:
It also completely distorts the sortability/skimmability of lists + accuracy of word counts + makes it more annoying to "Ignore" all cases of a misspelled word. There just isn't any comparison in my mind. The amount of correct acronyms would be heavily outweighed by "wrong" spelling (red squigglies) on every single book's usage of sentence-ending periods. Quote:
I'll continue being grumpy about this change... but great work on the rest of Sigil's enhancements though! Still my favorite EPUB editor ever! Last edited by Tex2002ans; 08-30-2022 at 06:40 PM. |
||||
08-30-2022, 07:04 PM | #6 |
Grand Sorcerer
Posts: 27,577
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
It's not like there's any danger of running out of misplaced grumpiness anytime soon.
But I have to ask: why waste time being grumpy when you can substitute your own hunspell library (or libraries) that work exactly like you want? There's no point in suffering when the workaround is easy enough is there? |
08-30-2022, 07:45 PM | #7 |
Sigil Developer
Posts: 7,683
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Plus this occurs only with words that are *not* in the dictionary at the end of a sentence. An example here might be a proper first name. Words in the dictionary that end a sentence are properly checked. So comparing end of sentence words against abbreviations and acronyms is not the correct comparison.
And as DiapDealer said, just install your own hunspell dictionary that is not built to handle abbreviations if you so desire. |
08-31-2022, 01:29 AM | #8 | |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
While I was gathering this info, I believe I found a bug in Sigil 1.9.10:
1. Paste this code into a book:
Code:
<p>This is an example of a <i>sentence-ender</i>. That continues for another sentence.</p>
3. Search: .
The period after "sentence-ender" will show up as a "word" by itself. (Because of the closing italic + .)
- - -
Enhancement Request: Think we could get the Spellcheck List "Count" column aligned right? Similar to my 2021 Reports columns alignment request! For easier readability/comparability.
- - -
Enhancement Request #2: Think we could get the "Language" + "Misspelled?" columns flipped?
Before:
Code:
Word        | Count | Language                | Misspelled?
____________|_______|_________________________|____________
example     | 1     | English (United States) | No
examples    | 2     | English (United States) | No
exampled    | 3     | English (United States) | No
exampleness | 4     | English (United States) | Yes
español     | 5     | Spanish                 | No
After:
Code:
Word        | Count | Misspelled? | Language
____________|_______|_____________|________________________
example     | 1     | No          | English (United States)
examples    | 2     | No          | English (United States)
exampled    | 3     | No          | English (United States)
exampleness | 4     | Yes         | English (United States)
español     | 5     | No          | Spanish
Plus, if you didn't care about Language, you could easily resize the window slightly to chop it off. :P (And some of those Languages are REALLY verbose!)
- - -
Quote:
But for a common user (like Mbear or GreggBell) to know where to dig and "fix this" and substitute with an old hunspell, it's just madness. Put a foot in the normal human's shoes for a second!
- - - -
Introductory Note: I believe making this acronym/sentence-ender optional would help, similar to:
Maybe that UI can be adjusted slightly, and a new checkbox introduced:
(Or some much better name.)
(And, personally, I'd argue for it to be OFF by default. See reasoning below.)
(Advanced users can then turn it ON if needed, just like "Check Numbers".)
- - - -
The Big Picture on Closing-Periods-as-Words
The largest problem I have is:
The entire purpose—and extreme power—of Spellcheck Lists is to be able to compare/sort + get accurate counts. This allows you to quickly see, at-a-glance, problems which would have otherwise been hidden or very hard to spot: Code:
peeked   | 5
peaked   | 1
Rothbard | 100
Rothbird | 2
Rotbard  | 1
Malone   | 20
Molone   | 2
Mises    | 50
Misses   | 3
Code:
peeked    | 3
peeked.   | 2
peaked    | 1
Rothbard  | 60
Rothbard. | 40
Rothbird  | 1
Rothbird. | 1
Rotbard   | 1
Malone    | 19
Malone.   | 1
Molone    | 1
Molone.   | 1
Mises     | 40
Mises.    | 10
Misses.   | 3
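To make the count problem concrete, here is a minimal sketch of folding the "word." entries back into their base words outside of Sigil. It is not a Sigil feature, and the function name is made up. It assumes the word/count pairs have already been copied out of the list by hand, and it treats every trailing period as a sentence-ender, so real abbreviations such as "etc." would be folded too.

```python
from collections import Counter


def fold_trailing_periods(counts):
    """Merge each 'word.' entry into 'word' and add up the counts."""
    folded = Counter()
    for word, n in counts.items():
        key = word[:-1] if len(word) > 1 and word.endswith(".") else word
        folded[key] += n
    return folded


# The second list above, typed in by hand.
WITH_PERIODS = {
    "peeked": 3, "peeked.": 2, "peaked": 1,
    "Rothbard": 60, "Rothbard.": 40, "Rothbird": 1, "Rothbird.": 1,
    "Rotbard": 1, "Malone": 19, "Malone.": 1, "Molone": 1, "Molone.": 1,
    "Mises": 40, "Mises.": 10, "Misses.": 3,
}

if __name__ == "__main__":
    for word, n in fold_trailing_periods(WITH_PERIODS).items():
        print(word, n)
```

Folding the second list this way gives back the counts in the first list, for example Malone 19 + 1 = 20.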
- - -
Lost in the Clutter
For example, before, these show up right next to each other: Spoiler:
Now, you have a "visual clutter" full of: Spoiler:
This at-a-glanceness gets worse when you SORT by Count: Spoiler:
How many times did "Malone" show up in this book? 20 times. But not according to the period-sort! 19+1.
This makes comparison extremely hard.
- - -
Side Note: One of the most common typos is seeing a spelling 10+ times, and a similar spelling 1 time, think:
Code:
color  | 10
colour | 1
Code:
color  | 9
color. | 1
colour | 1
- - -
See some comparison images. Before vs. After:
Far fewer words per screen, and near-words (or typos) get lost under all the "duplicate periods". Plus, your eyes are always "stutter-stepping", because of the:
When 99% of these extra "words" are duplicate periods... your brain turns off.
- - -
Those are the major problems, as I see it, but I've got many other intermediate/smaller ones too. Here are a few I picked out of the last book I worked on:
Problem #1: Indexes/PageNumbers (especially roman numerals)
Similar to Spellcheck Lists in Calibre getting flooded with numbers, this index/roman numeral issue also kicks it up to 1000! You get hundreds and hundreds of extra:
Code:
i.
ii.
ix.
I.
II.
III.
Problem #2: URLs Lost
One trick I love/d to use is a period to find yet-to-be-linked URLs in a book:
URLs are completely lost in Spellcheck Lists now. Not just by (partial) acronyms but by every single sentence now too!
Problem #3: URL "sentence-enders"
I've seen lots of nearly-doubling:
When you work on citations, this becomes a huge problem!
- - -
Do I need to continue? I have lots more examples! The current way of Sigil's new default spellchecking is unlike any other program there is, and not in a good way!
- - -
Anyway, like I said, an easy-to-use checkbox would be a nice addition. (OFF by default!) Then we could say:
Those users who know what they are doing + have specific cases for it can enable it. But for the love of all that is holy, put advanced stuff as options for the advanced users!
- - -
PS. I still love you, KevinH and Diap, but sometimes I want to just hug your little necks with two hands!
Last edited by Tex2002ans; 08-31-2022 at 03:39 AM. |
|
08-31-2022, 06:07 AM | #9 | |
Grand Sorcerer
Posts: 27,577
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
But "madness"?? That's a tad extreme. The ability to add custom hunspell dictionaries has long been a feature of Sigil. Was it madness to offer that ability? Don't get hung up on needing to get an "old" hunspell dictionary. "Old" was in the sense that any version of Sigil previous to 1.9.1 would have one. ANY non-Sigil-1.9.1 hunspell dictionary will suffice. It doesn't need to be "old". Last edited by DiapDealer; 08-31-2022 at 06:26 AM. |
|
08-31-2022, 06:15 AM | #10 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
The normal children—ages 3+!
Last edited by Tex2002ans; 08-31-2022 at 06:18 AM. |
08-31-2022, 07:58 AM | #11 |
Connoisseur
Posts: 52
Karma: 10
Join Date: Sep 2021
Location: Upstate NY, USA
Device: iPad Pro, Kindle basic
|
|
08-31-2022, 08:25 AM | #12 |
A Hairy Wizard
Posts: 3,114
Karma: 18727091
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Way to throw mbear and GregBell under the bus!! Lol
I actually played around with this spellcheck stuff… and the periods after a duplicate word were a slight annoyance, but not that big a deal. Once you add the root word to a dictionary you can refresh the spelling list… it finds the word(s) in the dictionary and doesn’t display them in the misspelled list anymore. The refresh is very fast. The slight annoyance came from having to move the mouse back and forth between selecting the word and then clicking a button, since you have to skip every word with a period. Otherwise you would have both words and period words in the dictionary. Is there any way to make the spellcheck buttons return focus to the word list, preferably to the word just below the previous location? That would allow people to use the arrow keys to skip over any words (dotted or otherwise) without moving the mouse. |
08-31-2022, 09:06 AM | #13 |
Sigil Developer
Posts: 7,683
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Just install any Hunspell en dictionary and the previous behaviour you want is restored. So no checkbox preferences or ui changes will be made. No editing of aff is needed. A dictionary that supports abbreviations and reduces spurious suggestions needs to be designed to do so. It uses different word lists. So using just a checkbox preference is not possible. But adding a hunspell dictionary is easily done by users.
Trying to misuse spellcheck to check for bad urls when a plugin exists to do that seems a bit over the top.
I will look into reordering the columns and right adjusting the counts, just NOT for the next release, as that would entail large changes in the User Guide spellcheck related images when we are just about to make a new release.
Yes, adding the word to the dictionary will remove its sentence-ending variant when a refresh is done.
So just install your own hunspell en dictionary into your Sigil Preferences and you will never have to live with it. I work on many math heavy papers typically loaded with abbreviations and acronyms but with very few or next to no proper first names. I like the hunspell dictionary I built to handle that aspect more correctly. If you do not like that behaviour, simply install any English hunspell dictionary (newer or older) just once to get the behaviour you want.
And if you want to make accurate counts, I recommend using the new Saved Search Group Counts Report feature and not trying to use SpellCheck for that. It was added for just that purpose.
Last edited by KevinH; 08-31-2022 at 10:18 AM. |
08-31-2022, 10:42 AM | #14 | |
Imperfect Perfectionist
Posts: 472
Karma: 724664
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
|
Quote:
As for how un-annoying it is, well, yes, if the text you are checking is in the same language as the spell checking dictionary, without many foreign words and places, then fine. But translated works, playing out in far & foreign places …. I have just spell checked Conan Doyle's Brigadier Gerard stories, and since the good brigadier gets around in the world, there are a lot of "misspelled" words at the end of lines - oh, SOO much fun. And I didn't find one - not one! – misspelled abbreviation that would have justified the thing. Regards, Kim |
|
08-31-2022, 10:43 AM | #15 |
Sigil Developer
Posts: 7,683
Karma: 5433388
Join Date: Nov 2009
Device: many
|
ps. Since it is only a one line change that will not completely invalidate the user guide spellcheck images, I have aligned the count field right (numerically). Any change in column order will have to come in a future release, just not our upcoming one.
|