Old 02-01-2015, 11:35 AM   #7
HarryT
eBook Enthusiast
Posts: 85,557
Karma: 93980341
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by rumpumpel1 View Post
That's amazing: even for simple layouts and plain text there is a remaining error rate of one error per page? What kind of errors are these? Can they be corrected with the spell checker of a decent office program?
These are "normal" OCR errors, where the shapes of letters look similar, e.g. "clock" instead of "dock" ("cl" and "d" are very difficult for OCR to tell apart). A spell-checker won't help, because the results are real words, just not the right words.
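
To illustrate why a spell-checker misses these, here is a minimal sketch of a dictionary lookup. The word list, the sample sentences and the `misspelled` helper are all made up for the example; a real spell-checker is more sophisticated, but the basic limitation is the same.

```python
# Hypothetical, tiny word list standing in for a real dictionary.
WORDS = {"the", "ship", "left", "dock", "clock", "at", "noon"}

def misspelled(text):
    """Return the words in text that are not in the word list."""
    return [w for w in text.lower().split() if w not in WORDS]

# What the page actually says: "the ship left the dock at noon"
ocr_real_word = "the ship left the clock at noon"  # "d" misread as "cl"
ocr_nonword = "the ship left the dcck at noon"     # "o" misread as "c"

print(misspelled(ocr_real_word))  # nothing flagged: "clock" is a real word
print(misspelled(ocr_nonword))    # "dcck" is flagged
```

Only the second kind of error is visible to a dictionary check; the first needs a human reader (or something that understands context).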

A decent OCR program has an accuracy rate of better than 99.9%, but a typical page has around 2000 characters on it, so even at 99.9% that means about 2 character errors per page (0.1% of 2000). Some of these the OCR program's own spell-checker will fix for you, but some it will get wrong.
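
The arithmetic, as a quick sketch. The function name is just for illustration, and the 2000 characters per page is the rough figure from above, not a measured value.

```python
def expected_errors_per_page(accuracy, chars_per_page=2000):
    """Expected number of wrong characters on one page.

    accuracy is the per-character accuracy as a fraction (0.999 = 99.9%).
    """
    return (1.0 - accuracy) * chars_per_page

# 99.9% accurate OCR on a page of roughly 2000 characters.
print(round(expected_errors_per_page(0.999), 2))
```

Even a jump to 99.95% accuracy would still leave about one wrong character per page on the same assumptions.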