Old 02-27-2016, 06:57 PM   #1
BetterRed
null operator (he/him)
Posts: 20,583
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Spellchecker - enhancements

Would it be possible to:
  • add a button to save the current word list to a file, preferably with the counts comma separated, but without would be more than adequate.

    I've tried ctrl+a, ctrl+c but that didn't work (nor did I really expect it to) - but I'd prefer a button and that would be more obvious to other users.
  • add two options to the word list context menu

    • Change to the suggested word - assuming the topmost in the suggested list is correct, which it often is.
    • Copy word to clipboard, I know ctrl+c works but...

      I navigate my way down the list with the down arrow key, on a Windows keyboard the context menu key is almost adjacent to the down arrow key, so I use the context menu to Ignore, Add... and Find... -- so why not Change... and Copy...
  • add a button (and/or word list context menu item) to clear ignored words, there's an unassigned keyboard shortcut to do it, but... too hard to remember them all

    I have four Sigil-like spell checkers; the others are the calibre editor, Notepad++ and Epub Tools. None of them, Sigil's unassigned keyboard shortcut aside, seems to have an easy way to do this. AFAIK, they all require one to exit and restart the spell checker to reinstate the Ignored word list.

    In texts with many Proper Names I initially ignore them all and deal with the rest. After that I return to the Proper Names. Validating them isn't just a matter of spelling, it's also a matter of being factually correct - e.g. Stewart and Stuart are not the same name, and in a political-history context it really matters which one is used.

  • add a feature to better deal with misplaced hyphens. If a misspelt word has a hyphen - e.g. 'con-sidered' - and removal of the hyphen yields a valid word - i.e. 'considered' - then offer that word as the first choice in the list of suggestions.

    As I said elsewhere it's currently ninth in the list - I appreciate that's the doing of hunspell. And as I've also said elsewhere the apparently misplaced hyphen is sometimes deliberate - automatic removal would thwart the author's intent.
BR

Last edited by BetterRed; 02-27-2016 at 07:10 PM.
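
For readers who want such a file today, the export asked for in the first item can be approximated outside Sigil. The following is a minimal Python sketch, not Sigil code: `count_words` and `word_list_to_csv` are hypothetical helper names, the input is assumed to be plain text already extracted from the book, and it counts every word rather than only the misspelled ones shown in Sigil's dialog.

```python
import csv
import io
import re
from collections import Counter


def count_words(text):
    """Count each word; a word is letters with optional inner ' or - characters."""
    words = re.findall(r"[^\W\d_]+(?:['\-][^\W\d_]+)*", text)
    return Counter(words)


def word_list_to_csv(counts):
    """Return 'word,count' lines, sorted alphabetically ignoring case."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for word in sorted(counts, key=lambda w: (w.lower(), w)):
        writer.writerow([word, counts[word]])
    return buffer.getvalue()


# Invented sample text, for illustration only.
sample = "Stewart met Stuart. Stuart con-sidered it."
print(word_list_to_csv(count_words(sample)), end="")
```

Writing the returned string to a file gives the comma-separated list with counts; dropping the second column gives the plain list.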
Old 02-27-2016, 09:12 PM   #2
KevinH
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
BetterRed,

Quote:
add a button to save the current word list to a file, preferably with the counts comma separated, but without would be more than adequate.
I really think that this interface is busy enough and does not need more buttons for things most people would never use. Why would anyone want a list of misspelled words?!? That said, I should be able to make the list of misspelled words copyable to the clipboard and you should then be able to paste it into any editor if you really want a copy of the misspelled words.

Quote:
Change to the suggested word - assuming the topmost in the suggested list is correct, which it often is.
Copy word to clipboard, I know ctrl+c works but...
What is wrong with ctrl+c? There can be any number of suggestions, so having something that only works on the top suggestion again makes no sense to me. Why not simply copy it and paste it where you need it? Too many ways to do something just leads to problems documenting how to use the spell checker interface.

Quote:
I navigate my way down the list with the down arrow key, on a Windows keyboard the context menu key is almost adjacent to the down arrow key, so I use the context menu to Ignore, Add... and Find... -- so why not Change... and Copy...
I will think about it, as it would not clutter anything visually too much. But this is something that will not happen until after full epub3 support has been implemented in Sigil since my free development time is all used up for that. That said, if you can convince any other developer to make these changes, I would be happy to accept pull requests as long as they do not break anything and don't clutter up the interface.

Quote:
add a button (and/or word list context menu item) to clear ignored words, there's an unassigned keyboard shortcut to do it, but... too hard to remember them all
I am surprised there is a keyboard assignment for it. AFAIK, this is not possible without removing the complete dictionary from memory and reloading it. Ignored words are actually added to the main spellchecker dictionary that is loaded in memory for that ebook. When you "ignore" a word, you are telling the spellchecker to add it to its list of correct words temporarily. So clearing the ignored word list and restarting is the only easy way to deal with that. The solution is simply NOT to hit Ignore for misspelled words you want to revisit. You can skip to the next word without fixing it and return to it the next time you run the spellcheck.
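
The behaviour described above can be pictured with a toy model. This is only an illustrative Python sketch of the explanation in this post, not Sigil's or Hunspell's actual code; the class `SessionSpellchecker` and its methods are invented for the example.

```python
class SessionSpellchecker:
    """Toy model: ignored words are merged into the in-memory word set."""

    def __init__(self, dictionary_words):
        # Stands in for the dictionary files on disk (hypothetical).
        self._source = frozenset(dictionary_words)
        # Stands in for the dictionary loaded in memory for this ebook.
        self._loaded = set(self._source)

    def check(self, word):
        """Return True if the word is currently treated as correct."""
        return word in self._loaded

    def ignore(self, word):
        # The word joins the loaded set; no separate "ignored" record is kept.
        self._loaded.add(word)

    def clear_ignored(self):
        # With no separate record, the way back is to discard and reload.
        self._loaded = set(self._source)


checker = SessionSpellchecker(["considered"])
checker.ignore("Stuart")
print(checker.check("Stuart"))   # treated as correct while ignored
checker.clear_ignored()
print(checker.check("Stuart"))   # flagged again after the reload
```

In this model nothing remembers which entries came from Ignore, which is why clearing them amounts to a full reload.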

Quote:
add a feature to better deal with misplaced hyphens. If a misspelt word has a hyphen - e.g. 'con-sidered' - and removal of the hyphen yields a valid word - i.e. 'considered' - then offer that word as the first choice in the list of suggestions.
That is something for the author of Hunspell to consider adding as a feature. I have no plans to modify the Hunspell code we use inside Sigil to do this. It is purely stock hunspell code. The correct word should be someplace in the short list of 10 suggestions since their order is (was) based on ngram scoring. This is something more useful for OCR'd text and not reflowable text where hyphenation is done on the fly and not hard coded. So I recommend one of those plugins that do lots of automated clean-ups.

That said, wouldn't it be better to handle those words first by searching for all hyphenated words using a grep in Sigil's Find and Replace, see the word in context and decide if you want to keep the hyphen or not? Things like that are often better viewed in context of other text to see what the author actually meant. Then once that is done, you do the spellchecking.
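
As a rough illustration of this find-first approach, the Python sketch below lists every hyphenated word together with a little surrounding text. The pattern and the helper name `hyphenated_in_context` are assumptions made for this example; a similar expression could be tried in the regex mode of Sigil's Find and Replace, but the exact syntax there is not covered in this thread.

```python
import re

# Two or more runs of letters joined by ASCII hyphens, e.g. "con-sidered".
HYPHENATED = re.compile(r"\b[^\W\d_]+(?:-[^\W\d_]+)+\b")


def hyphenated_in_context(text, width=20):
    """Return (word, snippet) pairs; snippet has `width` characters either side."""
    results = []
    for match in HYPHENATED.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        results.append((match.group(), text[start:end]))
    return results


# Invented sample sentence, for illustration only.
text = "He con-sidered the well-known offer."
for word, snippet in hyphenated_in_context(text, width=8):
    print(f"{word!r} in ...{snippet}...")
```

Seeing each hit in context is what lets a human decide between a stray hyphen ('con-sidered') and a deliberate one ('well-known').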


Again, the best place for requests for improvements is as an issue on the Sigil GitHub site. Many more potential developers will see it there rather than here, and some might be convinced to try and create a pull request that implements some of them. It also means it won't get lost or forgotten.

Take care,

KevinH
Old 02-28-2016, 04:54 AM   #3
elibrarian
Imperfect Perfectionist
Posts: 466
Karma: 724664
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
Quote:
Originally Posted by KevinH View Post
Why would anyone want a list of misspelled words?!? That said, I should be able to make the list of misspelled words copyable to the clipboard and you should then be able to paste it into any editor if you really want a copy of the misspelled words.
Some people use such lists (I do … ). There's an extension for LibreOffice that does this, Linguist. It's not maintained, but still works in LO 5 (AFAIK the latest incarnation of it is in Python, and someone might be able to tweak it into a Sigil plugin), and several VBA macros for Word that do this can be found around the net (the ones I've tried are very slow, though).

Regards,

Kim
Old 02-28-2016, 09:40 AM   #4
KevinH
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,
For what purpose are those incorrect words used?

KevinH

Quote:
Originally Posted by elibrarian View Post
Some people use such lists (I do … ). There's an extension for LibreOffice that does this, Linguist. It's not maintained, but still works in LO 5 (AFAIK the latest incarnation of it is in Python, and someone might be able to tweak it into a Sigil plugin), and several VBA macros for Word that do this can be found around the net (the ones I've tried are very slow, though).

Regards,

Kim
Old 02-28-2016, 10:09 AM   #5
elibrarian
Imperfect Perfectionist
Posts: 466
Karma: 724664
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
Quote:
Originally Posted by KevinH View Post
Hi,
For what purpose are those incorrect words used?
KevinH
For my part, I use them to generate search-and-replace lists.

In Denmark, we used to print books and papers in blackletter ("gothic") fonts up to 1915-1920. My little niche of the Danish book market is reissuing some of those old texts in a form more readable to a modern reader. You can teach FineReader a lot, but not all; some of the letters are just too much like each other. However, there's usually some kind of system in the FineReader madness, and I can do mass replacements using such lists, prior to actually proofreading, with tools such as wReplace and TransTools' Multiple Replace. It can sometimes improve the readability of the text immensely …

As far as I remember, the original author of the Linguist extension for LibreOffice made it to generate lists of words not recognised by the Danish spellchecker (of which he was one of the original creators).
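
To make the mass-replacement step concrete, here is a minimal Python sketch of applying a search-and-replace list in one pass. It is not how wReplace or TransTools work internally; the function name `apply_replacements` and the sample pairs are invented for illustration (the misreadings are made up, though "som" and "og" are ordinary Danish words).

```python
import re


def apply_replacements(text, replacements):
    """Replace whole words using a {wrong: right} mapping, in a single pass."""
    if not replacements:
        return text
    # Longest keys first so overlapping entries prefer the longer match.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: replacements[m.group()], text)


# Hypothetical list built from a dump of unrecognised words.
ocr_fixes = {"fom": "som", "oq": "og"}
print(apply_replacements("fom oq fomme", ocr_fixes))
```

Matching whole words only keeps an entry such as "fom" from touching longer words that merely contain it, which matters when the list is applied before proofreading.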

Regards,

Kim
Old 02-28-2016, 11:47 AM   #6
KevinH
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
I looked at what calibre does. For every single word they spellcheck, they use a regular expression replacement of the regular and short hyphens with nothing. If that shortens the word, they spellcheck the shortened version first as a new word and, if it passes, add it first to the suggestions. They then spellcheck the original word, and test each new suggestion to prevent duplication with the no-hyphen suggestion.
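
The steps just described can be sketched roughly as follows. This is a reconstruction from the description in this post, not calibre's source: the function `suggest`, the two callables passed to it, and the exact set of hyphen characters stripped are all assumptions.

```python
import re

# Assumption: ASCII hyphen-minus and U+2010 HYPHEN; the thread does not say
# exactly which characters calibre removes.
HYPHENS = re.compile("[-\u2010]+")


def suggest(word, is_correct, raw_suggestions):
    """Return suggestions with the de-hyphenated form first when it is valid.

    is_correct(word) -> bool and raw_suggestions(word) -> list of str are
    stand-ins for the real spellchecker and its suggestion engine.
    """
    result = []
    dehyphenated = HYPHENS.sub("", word)
    # Only act if removing hyphens actually shortened the word.
    if len(dehyphenated) < len(word) and is_correct(dehyphenated):
        result.append(dehyphenated)
    # Append the engine's own suggestions, skipping duplicates.
    for candidate in raw_suggestions(word):
        if candidate not in result:
            result.append(candidate)
    return result


KNOWN = {"considered"}  # toy dictionary


def fake_check(word):
    return word in KNOWN


def fake_suggest(word):
    # Pretend the engine ranks the right answer last.
    if word == "con-sidered":
        return ["con-side red", "con sidered", "considered"]
    return []


print(suggest("con-sidered", fake_check, fake_suggest))
```

The cost discussed in this thread is visible in the sketch: every hyphenated word triggers one extra dictionary check before the normal suggestion lookup.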

Sorry, Sigil is not going to go through all of that for a special case that only comes up for OCR text. Either a plugin or just normal find and replace can be used before the spellcheck to easily distinguish real hyphenation from OCR-induced hyphenation and, even better, this would present the word in context.

Sorry.

KevinH
Old 02-28-2016, 08:59 PM   #7
BetterRed
null operator (he/him)
Posts: 20,583
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by elibrarian View Post
Some people use such lists (I do … ). There's an extension for LibreOffice that does this, Linguist. It's not maintained, but still works in LO 5 (AFAIK the latest incarnation of it is in Python, and someone might be able to tweak it into a Sigil plugin), and several VBA macros for Word that do this can be found around the net (the ones I've tried are very slow, though).

Regards,

Kim


@Kim - FWIW the calibre editor will ctrl+c copy multiple words from its spellchecker word list.

Today I would paste them into the upcoming release of The Sage - especially into the Word List tool; from there I can copy words to the Concordancer, Rhymer etc.

[Attached image: TheSageVII.jpg]

Why ? Character and place name tracking, consistent misspellings - especially in dialogue - across multiple books. All sorts of things.

Curious - the Sigil spell checker seems to ignore 'words' that start with, or maybe it's contain, digits - is that by design? If yes - good, if not - don't fix it on my account.

Maybe OCR is not the only source of misplaced hyphens. I've seen them in a number of purchased books that I'm pretty sure were not scanned. Apart from the misplaced hyphens there are none of the other OCR tell-tales. My guess is that they started life as SHY's and somewhere in the conversion hurdy-gurdy the SHY's were changed to regular hyphens.

BR
Old 02-29-2016, 06:11 AM   #8
Divingduck
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Quote:
Originally Posted by KevinH View Post
I looked at what calibre does. For every single word they spellcheck, they use a regular expression replacement of the regular and short hyphens with nothing. If that shortens the word, they spellcheck the shortened version first as a new word and, if it passes, add it first to the suggestions. They then spellcheck the original word, and test each new suggestion to prevent duplication with the no-hyphen suggestion.
KevinH
Here is how it works in calibre: take a look at the attached picture. It is part of the user dictionaries and an example of how it works with German words (we have a lot of bla-bla-bla hyphens).
You can find the discussion around the implementation here: https://www.mobileread.com/forums/sho...d.php?t=237869

I use this functionality very often in different situations.
[Attached image: User_dictioneries.JPG]

Last edited by Divingduck; 02-29-2016 at 06:21 AM.
Old 02-29-2016, 09:51 AM   #9
KevinH
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi DivingDuck,

I am still not sure I want to do an extra regular expression removal of possible hyphenation and if so spellcheck that word effectively twice (just to get that suggestion first) when the underlying code will find and suggest the non-hyphenated version just fine. This really is a feature request for hunspell's suggestion mechanism and not Sigil.

KevinH
Old 03-01-2016, 02:59 AM   #10
BetterRed
null operator (he/him)
Posts: 20,583
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by BetterRed View Post
  • add a button (and/or word list context menu item) to clear ignored words, there's an unassigned keyboard shortcut to do it, but... too hard to remember them all
@BetterRed - try clicking the Refresh button?

Oh no - that's far too easy - doh

Maybe theducks will lend me his cat to hide under.

BR

Last edited by BetterRed; 03-01-2016 at 03:08 AM.
Old 03-01-2016, 12:38 PM   #11
theducks
Well trained by Cats
Posts: 29,812
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by BetterRed View Post
@BetterRed - try clicking the Refresh button?

Oh no - that's far too easy - doh

Maybe theducks will lend me his cat to hide under.

BR
Which one ya wanna borrow? I got 4.
3 love stomping on Keyboards or sitting between User and Monitors
Old 03-01-2016, 05:53 PM   #12
BetterRed
null operator (he/him)
Posts: 20,583
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by theducks View Post
Which one ya wanna borrow? I got 4.
3 love stomping on Keyboards or sitting between User and Monitors
Maybe I need to borrow all 4 of 'em

Quote:
Originally Posted by Sigil User Guide 0.7.2

To Ignore words, click Ignore. The Misspelled Word column will update to "No" only if you have enabled the dictionary to be used in Preferences. To clear all ignored words use Tools→Spellcheck→Clear Ignored Words. Ignored words are also cleared when you open a new book -
The menu item and the keyboard shortcut don't 'function' when the spellcheck word list dialogue is active, but the Reset button does - that's fine, makes sense too.

BR

Last edited by BetterRed; 03-01-2016 at 07:47 PM.