#1 |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Speed up Spell Check list traversal
Using Sigil 0.9.2 on Win 10. My system disk is a 256GB Toshiba SSD. My CPU is an Intel Core i5 650 @ 3.20GHz (Clarkdale) and I have 6GB of DDR3 RAM, no overclocking.
Is there anything I can do to make traversing the spell check word list with the keyboard arrow keys more responsive? I often overshoot by hitting down arrow too many times, then I have to sit back and wait until it catches up. I tried dismissing Preview and other windows, to no avail. The size of the epub seems immaterial: I can't discern any difference between a 5,000 word epub with a single file and a 300,000 word epub with 30+ files. Nor does the size of the word list seem to matter.
I mistakenly thought it was a 0.9 versus 0.8 issue, so I installed 0.8.6 32-bit; it made no difference. So it's me becoming less tolerant of sluggish response, and more averse to using the mouse. FWIW the calibre editor is similarly sluggish.
If there's nothing I can do, then is there anything 'you' can do? It would appear the delay is caused by the code view window having to 'navigate' to the 'selected' word in the xhtml files. I'd be happy to forgo that happening automatically via a preference, provided there was a keyboard shortcut and/or context menu item to do it manually; most of the time I don't need it.
BR
#2 |
Well trained by Cats
Posts: 30,912
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Hmmm!
You are observant. Arrowing through the list in Sigil is way more sluggish than in Calibre. How many dictionaries are enabled?
#3 | |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Two real dictionaries (English and French), both much larger than the dictionaries shipped with Sigil and Calibre. I found them at Apache (I think). Some smallish user 'dictionaries' in Sigil, none in Calibre because I don't ordinarily use its spell checker.
Calibre is much slower loading the dictionaries. I can't discern any tangible difference between traversing its list of misspelt words and Sigil's: one's like porridge, the other's like treacle.
BR
#4 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Forgive my ignorance here, because I normally have it highlight misspelled words in CodeView with a squiggly line and then scroll through looking at them. I rarely invoke the spellcheck window.
Alternatively, I can hit the Spellcheck menu itself and it brings up a list of every misspelled word in the document. I can then use the down arrow to bring up the next word, and it brings up the first place this misspelled word exists, no matter where it is in my document. I can hit the down arrow as fast as I can and never see much if any delay.
This is with the updated (now standard) Hunspell dictionaries that come with Sigil (i.e. ones that properly use affixes, not the horrible one that was just a long word list Sigil had previously). The 60K word list with affix file actually properly checks many, many more words than the long word list one ever could. You might want to see if reverting to the Sigil standard Hunspell dictionaries impacts the issue.
The person who set up spellchecking in Sigil previously really did not understand affix compression and how it works. Being the author of the original MySpell that Hunspell was based on, I found it funny that most people thought dictionaries were just long lists of words!
This process appears to be reasonably fast to me. So exactly what menu or tool are you invoking, and exactly what buttons or keys are you hitting, so that I can try to recreate what you are seeing?
If it matters, my machine is a 27-inch iMac from mid 2010, with a 2.93 GHz Intel Core i7 and 16GB of DDR3 RAM. It has a slow hard disk (not SSD or Fusion). My machine is close to 6 years old now and seems to be more than enough to handle spellchecking quite easily.
KevinH
#5 |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Kevin
You're right, it is dictionary related! For some reason I had discounted that possibility.
I disabled my user 'dictionaries' and removed the ones I got from Apache, so I think I'm using these from C:\Program Files\Sigil\hunspell_dictionaries
Code:
en_GB.aff 77,920 2015/12/18 16:25:04
en_GB.dic 758,163 2015/12/18 16:25:04
Is it possible to use another Hunspell dictionary, such as one of these ==>> LibreOffice -Dictionaries? Incidentally, they're the same dictionaries I previously obtained from Apache. And if so, how should I do that? Here's the innards of one of them
Added: not only is traversing the word list lickety-split, starting Sigil is also much faster (1 second rather than 3-5 seconds).
BR
Last edited by BetterRed; 02-24-2016 at 07:44 PM. Reason: correct folder name para 2
#6 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
When I took over Sigil, I updated the dictionaries to be the best ones from Apache/Open/Libre office I could find that had compatible licenses. The key is to look at the .aff file to make sure it has the most common prefixes and suffixes listed, and then make sure the .dic file actually uses those prefixes and suffixes and is not just a long list of words. A good .dic will show a list of root words with single-char suffix and prefix flags attached to each root word entry. In this way a shorter list of root words and flags can represent a much, much longer list of words than is possible without them. It will load quicker and will do a better job. That is the whole purpose of using an affix approach.
So unpack that dictionary, head the .dic file, and look for root words with flags (or post a snapshot here). If it doesn't use affix compression, it is generally not worth using.
KevinH
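For anyone who would rather count than eyeball the output of head, the check described above can be sketched in a few lines of Python. This is only a rough illustration, not part of Sigil or Hunspell: the names `count_flagged` and `flag_usage` are made up, the sample entries and their flags are invented, and the UTF-8 default is an assumption (the real encoding is whatever the dictionary's .aff file declares).

```python
import re
from pathlib import Path

# A "/" that is not escaped as "\/" separates the root word from its flags.
FLAG_SEPARATOR = re.compile(r"(?<!\\)/")


def count_flagged(lines):
    """Return (entries, flagged) for the lines of a Hunspell-style .dic file."""
    entries = [ln.strip() for ln in lines if ln.strip()]
    # The first line of a .dic file is normally an approximate entry count.
    if entries and entries[0].isdigit():
        entries = entries[1:]
    # Only inspect the first whitespace-separated field; later fields may
    # hold morphological data rather than the word itself.
    flagged = sum(1 for e in entries if FLAG_SEPARATOR.search(e.split()[0]))
    return len(entries), flagged


def flag_usage(dic_path, encoding="utf-8"):
    """Same count, read from a file. The encoding default is an assumption."""
    text = Path(dic_path).read_text(encoding=encoding, errors="replace")
    return count_flagged(text.splitlines())


# Invented sample: two entries with flags, two bare words.
sample = ["4", "attain/DGLS", "cat/S", "attainable", "attainableness"]
total, flagged = count_flagged(sample)
print(f"{flagged} of {total} entries carry affix flags")
```

A dictionary where almost no entries carry flags is, in the terms used above, just a long list of words.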
#7 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
BTW,
If you do find a long list of words you like, there is a program that used to be part of MySpell that will let you use an existing .aff file of prefix and suffix definitions. It will crunch on the long list and create a much shorter list of root words and flags you can use to replace it exactly, which will load much, much faster and work better. I will have to see if I still have my original copy of MySpell someplace, in case you want to try it if that particular dictionary is just a long list of words.
KevinH
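To make the idea concrete, here is a toy sketch of that kind of compression in Python. It is not the MySpell program and makes no claim to match its algorithm: the three suffix rules, the flag letters, and the function names are all invented for illustration, and real .aff rules also strip characters and test conditions on the root, which this sketch ignores.

```python
# Toy suffix rules: flag -> suffix appended to the root. These flags and
# rules are invented; they are not taken from any real .aff file.
RULES = {"S": "s", "D": "ed", "G": "ing"}


def compress(words, rules=RULES):
    """Greedily fold a flat word list into 'root/FLAGS' entries."""
    vocab = set(words)
    covered = set()
    entries = []
    # Shortest first, so a root is always seen before words derived from it.
    for word in sorted(vocab, key=lambda w: (len(w), w)):
        if word in covered:
            continue  # already represented by a shorter root plus a flag
        flags = [f for f in sorted(rules) if word + rules[f] in vocab]
        covered.update(word + rules[f] for f in flags)
        entries.append(word + "/" + "".join(flags) if flags else word)
    return sorted(entries)


def expand(entries, rules=RULES):
    """Rebuild the full word set from 'root/FLAGS' entries."""
    words = set()
    for entry in entries:
        root, _, flags = entry.partition("/")
        words.add(root)
        words.update(root + rules[f] for f in flags)  # single-char flags
    return words


words = ["attain", "attains", "attained", "attaining",
         "walk", "walks", "walked", "cat"]
entries = compress(words)
print(entries)  # ['attain/DGS', 'cat', 'walk/DS']
# The short list replaces the long one exactly.
assert expand(entries) == set(words)
```

Eight words shrink to three entries here, and expanding the entries gives back exactly the original list, which is the "replace it exactly" property described above.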
#8 |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Thanks Kevin,
Indeed the KPP .dic file is just a long list of words. Interestingly the .aff file appears to be a 'slightly' earlier version of the one you ship.
For the root word 'attain', the KPP dictionary has 21 words, your GB dic has 7. Just 1 of the 21 words raises a spellcheck error: attainablenesses.
Definitely not a case of bigger is better; the KPP .dic file is almost 10 times the size of the one you're shipping.
I'll allocate some time over coming weeks to spell checking some 'finished' books with 'your' dictionaries before I do anything further, provided it gets cooler. It was 47°C (117°F) in some parts of Sydney today, and it's still 31°C at 8:45 pm.
Thanks for putting me on the right track.
BR
Last edited by BetterRed; 02-25-2016 at 06:03 AM.
#9 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
The other issue is that people think the spellcheck function should be like the OED in completeness. That is simply not true. The more obscure words there are in the dictionary, the more they help to hide many of the most common mistakes. I read that Shakespeare used only about 15,000 different words in all of his plays, whereas most people use between 3,000 and 6,000 words. These huge long word lists are really a mistake.
I think the program that uses affixes to compress the word list must have been lost in the transition from MySpell to Hunspell somehow, as no one seems to understand what the .aff file is for and how to use it. Sad really.
KevinH
#10 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
I wrote about this before in one of my posts, when discussing common OCR errors I have run across over the years, plus problems with dictionaries that are too encompassing.
I tend to prefer doing spellcheck in Sigil instead of Word, because I find Word allows too many words through as "correct". If I remember correctly, I believe Calibre considers the pre-hyphen and post-hyphen words separately, while Sigil treats them as entire words(?). I tend to prefer the Sigil method in that case as well!
#11 | ||
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
FWIW Word has a little-known exclusion feature: it's just a list of words you can maintain for each installed dictionary. They're installed as empty files in %AppData%\Microsoft\UProof, e.g. ExcludeDictionaryEN0c09.lex
Quote:
Maybe Sigil could do something similar; in the above example 'considered' is ninth in the list of suggested corrections.
@KevinH - was your program that took a word list and applied the affix rules to create an optimal .dic file called 'munch'? There's a program in hunspell-tools of that name that seemingly does something very similar.
BR
Last edited by BetterRed; 02-25-2016 at 04:13 PM.
#12 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Hi BetterRed
Yes, it was called munch, in that it would "munch" a long list of words into a set of root words with affix flags. Having a good .aff file is a requirement to use munch. Glad to hear it made it into Hunspell. I wonder why no one seems to use it?
KevinH
#13 | ||
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I initially brutalized the "Search" in Tools -> Index to find all of the hyphenated words in a book.
#14 | |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
#15 |
Sigil Developer
Posts: 8,487
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Yes, his manual link went to the original doc I wrote on how to create prefix and suffix rules many, many years ago. I am glad to see it still exists! I see my MyThesaurus is still being used by some projects too.
I retired as the maintainer of the OpenOffice lingucomponent project when it just was not fun anymore. Too many people wanted too many things and it became like work. Luckily the author of Hunspell took over and expanded on MySpell to support languages with large amounts of compounding of words and even compounding of prefixes and suffixes. For Latin-based languages like English, MySpell did everything I wanted.
Take care,
KevinH
Last edited by KevinH; 02-25-2016 at 06:52 PM.