10-03-2013, 11:14 AM   #6
At_Libitum
Addict
Posts: 265
Karma: 724240
Join Date: Aug 2013
Device: KyBook
I like the initiative, but I think there is still room for improvement. It flagged a lot of perfectly valid, correctly spelled words as a 'Possible OCR error'. I have no idea whether this is feasible, but I think you could cut these false positives a lot by looking at the context in which the word is used. Not every 'h' is a misread 'b', or vice versa; the same goes for 'l' and 'f' versus 't'.
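
To make the idea a bit more concrete, here is a rough Python sketch. It is not real context analysis, just a simpler stand-in: a word is only flagged when it is not a known word itself and swapping one confusable letter turns it into a known word. The names `CONFUSABLE`, `ocr_candidates` and `is_possible_ocr_error` and the tiny word list are made up for illustration; I do not know how ePub spellchecker does this internally.

```python
# Hypothetical sketch: flag a word as a possible OCR error only when it is
# NOT a known word, but one confusable-letter swap turns it into one.
CONFUSABLE = [("h", "b"), ("b", "h"), ("l", "t"), ("t", "l"), ("f", "t"), ("t", "f")]


def ocr_candidates(word, dictionary):
    """Return known words reachable from `word` by one confusable-letter swap."""
    found = set()
    for i, ch in enumerate(word):
        for src, dst in CONFUSABLE:
            if ch == src:
                candidate = word[:i] + dst + word[i + 1:]
                if candidate in dictionary:
                    found.add(candidate)
    return sorted(found)


def is_possible_ocr_error(word, dictionary):
    """A valid word is never flagged, even if it contains a confusable letter."""
    lowered = word.lower()
    if lowered in dictionary:
        return False
    return bool(ocr_candidates(lowered, dictionary))


# Tiny made-up word list, for illustration only.
words = {"the", "hat", "bat", "have"}
print(is_possible_ocr_error("hat", words))  # valid word, so not flagged
print(is_possible_ocr_error("tbe", words))  # unknown, but one swap away from "the"
```

With a rule like this, "hat" and "bat" would both pass untouched, and only unknown words that are one swap away from a real word would show up in the list.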

I also found, by accident, that it seems to suffer from OCR blindness itself. I was trying it out on "Three Men in a Boat" by Jerome K. Jerome, and there is a sentence in the ePub that reads:

"Harris, in moving about, trod on George’s corn."

ePub spellchecker actually read "corn" as "com" and flagged it as such, AND it gave the same word as the suggested replacement.
(see attachment)

Also, about the unneeded hyphens: not being a native English speaker, I would need to study up on when and where they are normally used, but I am almost sure there are words that require them. I have not looked too deeply yet, but I assume there is some kind of exception list somewhere so that not everything containing a hyphen is flagged.
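
What I have in mind with the exception list is something like the sketch below. The list contents, the rule (only flag when the joined form is a known word) and the name `is_unneeded_hyphen` are my own assumptions for illustration, not how the tool actually works.

```python
# Hypothetical exception list; a real one would be much longer.
HYPHEN_EXCEPTIONS = {"mother-in-law", "well-being", "twenty-one"}


def is_unneeded_hyphen(word, dictionary, exceptions=HYPHEN_EXCEPTIONS):
    """Flag a hyphenated word only if it is not a listed exception and its
    joined form (hyphens removed) is itself a known word."""
    lowered = word.lower()
    if "-" not in lowered or lowered in exceptions:
        return False
    return lowered.replace("-", "") in dictionary


# Made-up word list, for illustration only.
words = {"tomorrow", "wellbeing"}
print(is_unneeded_hyphen("to-morrow", words))   # joined form is a known word
print(is_unneeded_hyphen("well-being", words))  # on the exception list
```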

EDIT: PS. It may not have had this when you started ePub checker, but the current Sigil build has an option with a similar approach to yours. If you use the Spellcheck button you get, as in ePub spellchecker, a list of presumed misspellings with frequency counts. Also as in ePub spellchecker, you can have all occurrences of a word replaced at once, but not every 'misspelling' at once. Replacing everything at once may be a bit too aggressive anyway, because you always need to review the list to make sure only true misspellings get replaced, so in the end you still spend the same amount of time. That aside, yours does offer more information: it tries to categorize the type of spelling error AND, more importantly, it shows the context.

Suggestion: Extend the Options filter to cover every type of possible error, so that you can filter on each category separately instead of only on "Show only errors & warnings". (And of course 'Copy all suggestions ...' should then only affect the filtered list.)
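
A minimal sketch of what I mean by this, in Python. The `Finding` record, the category names ("ocr", "hyphen", "misspelling") and both function names are hypothetical; I am only guessing at how the findings might be stored.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    word: str        # the flagged word as it appears in the book
    category: str    # hypothetical category label, e.g. "ocr" or "hyphen"
    suggestion: str  # suggested replacement, empty if there is none


def filter_findings(findings, categories):
    """Keep only the findings whose category the user ticked in the filter."""
    wanted = set(categories)
    return [f for f in findings if f.category in wanted]


def copy_suggestions(findings):
    """Collect word -> suggestion pairs from the given (already filtered) list."""
    return {f.word: f.suggestion for f in findings if f.suggestion}


findings = [
    Finding("com", "ocr", "corn"),
    Finding("to-day", "hyphen", "today"),
    Finding("teh", "misspelling", "the"),
]
only_ocr = filter_findings(findings, ["ocr"])
print(copy_suggestions(only_ocr))
```

The point is that the copy step takes the filtered list as input, so whatever is hidden by the filter is never copied.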

Suggestion 2: About the context preview: it would be cool if it could show the rendered version instead of the HTML code itself. Also, there seems to be some useless extra space inserted before the bolded misspelling. I don't think you need that to accentuate the misspelling if you already bold the word.
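
Even a plain-text version of the context would already be easier to read than raw markup. Below is a rough sketch using Python's standard `html.parser` module; it is not true rendering, it just drops the tags, decodes entities and collapses whitespace. The function name `plain_context` and the sample snippet are made up.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML snippet."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities like &rsquo;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_context(html_snippet):
    """Return the snippet's text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_snippet)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Made-up snippet, including a stray space before the bolded word.
print(plain_context("<p>trod on George&rsquo;s <b> corn</b>.</p>"))
```

Collapsing the whitespace also gets rid of the extra space before the highlighted word.
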
Attachment: funny-non-error.png

Last edited by At_Libitum; 10-03-2013 at 12:11 PM.