Quote:
Originally Posted by BetterRed
Would be nice if you could - Sigil's hyphenated word list is a reason why I stick with it. Today I found Earls-Court and Bays-water in the one document :rolleyes
Currently I am using both at the same time. The Case Sensitive Search is an INCREDIBLE addition and helps catch a certain type of very hard-to-spot error, while Sigil's hyphenated word list catches a completely different set of OCR errors/inconsistencies.
I am using Calibre's list to point out/narrow down the errors, and then just doing all my fixes in Sigil.
Suggestion: Another odd thing I noticed in Calibre's Spell Check List is numbers.
I believe that "words" that are completely made of numbers + periods + commas should not be included in the list at all.
I believe Sigil handles this by removing any "word" that contains ANY numbers. But after seeing Calibre's list, I still think it is useful to keep "words" that contain SOME numbers. For example, these can then be caught/stand out like sore thumbs:
- OCR Errors
- Legitimate Uses
  - 27th
  - 22nd
  - 2d
    - Used in older books instead of the more modern forms "nd" and "rd"
  - pp. 28ff.
Seeing these in list form + the number of times they occur in the book is extremely helpful for spotting inconsistencies.
Perhaps you can safely remove "words" that are FULLY numbers, but still keep the ones that only contain SOME numbers?
Perhaps it can be another toggle? Include numbers / exclude numbers? (Or perhaps this would make the UI too cluttered?)
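To make the suggestion concrete, here is a rough Python sketch of the filtering rule I have in mind. This is NOT how Calibre or Sigil actually build their word lists. The names `is_fully_numeric`, `filter_words`, `count_words` and the `include_mixed` toggle are all made up for illustration, and I am assuming the word list arrives as a plain list of strings.

```python
import re
from collections import Counter

# Assumption: "fully numeric" means the token consists only of digits,
# periods, and commas, and contains at least one digit (e.g. "1,234.56").
FULLY_NUMERIC = re.compile(r"[\d.,]*\d[\d.,]*")


def is_fully_numeric(word):
    """Return True if the whole token is digits/periods/commas with a digit."""
    return FULLY_NUMERIC.fullmatch(word) is not None


def filter_words(words, include_mixed=True):
    """Drop fully numeric tokens; optionally drop any token containing a digit.

    include_mixed=True  keeps tokens such as "27th", "2d", "28ff."
    include_mixed=False mimics the "remove anything with a number" behaviour
    """
    kept = []
    for word in words:
        if is_fully_numeric(word):
            continue  # always hidden: pure numbers add nothing to the list
        if not include_mixed and any(ch.isdigit() for ch in word):
            continue  # hypothetical toggle is off: hide mixed tokens too
        kept.append(word)
    return kept


def count_words(words, include_mixed=True):
    """Count occurrences of the surviving tokens, as a spell check list would."""
    return Counter(filter_words(words, include_mixed))
```

With the toggle on, something like `count_words(["1,234.56", "27th", "2d", "27th"])` would keep "27th" (counted twice) and "2d" while hiding "1,234.56", which is exactly the kind of list that makes the inconsistencies stand out.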
Side Note: I am currently working on digitizing 12 years of a journal (~ 2 million words). The perfect size to put Calibre's Editor through some serious testing!
Now, all we need is the fantastic Reports functionality to come over to Calibre's Editor.