10-01-2014, 04:11 AM   #24
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by Ghitulescu
Again, my friend, you're talking about English and its 26 letters. Have you ever, to give an example, tried to scan Polish or Hungarian books? I am sure you have not. And even there, there are errors that need human proofing, like I and l (capital i and small L). I know there are programs that can learn the characters/glyphs, but they still apply the English rules (there are languages where "i" is written as such, for instance).
If you take the time to let the OCR program learn, the quality goes up considerably. With a new OCR program, I usually put it in learn mode for at least 3-5 pages of each book. After about 10 books it has learned enough, and the number of OCR errors is low. For diacritics, the result depends not only on a reasonable scan quality but also very much on the source. That is why I scan at 400 dpi (we have diacritics, but not that many). My program that Hitch is talking about will help you catch a lot of OCR errors, regardless of language. You can easily add your own search/replace (S/R) actions for common OCR errors to that procedure. The tool has many more procedures and checks to help you further.
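To give an idea of what a custom S/R action for OCR errors can look like, here is a minimal sketch in Python. It is not how my tool is implemented; the rule list, the `apply_rules` name, and the sample patterns are made up for illustration, and the patterns are English examples that you would replace with ones that fit your own language.

```python
import re

# Hypothetical rule list: (pattern, replacement) pairs for common OCR mix-ups.
# These are illustrative English rules, not taken from any real tool.
OCR_RULES = [
    # A lone small "l" directly before a common verb is almost certainly "I".
    (re.compile(r"\bl\b(?=\s+(?:am|was|have|will)\b)"), "I"),
    # A capital "I" squeezed between lowercase letters is usually a small "l".
    (re.compile(r"(?<=[a-z])I(?=[a-z])"), "l"),
    # "tbe" is a frequent misreading of "the".
    (re.compile(r"\btbe\b"), "the"),
]


def apply_rules(text, rules=OCR_RULES):
    """Apply every rule in order; return the fixed text and the number of hits."""
    total = 0
    for pattern, replacement in rules:
        text, hits = pattern.subn(replacement, text)
        total += hits
    return text, total


fixed, hits = apply_rules("l am sure tbe fiIe is fine")
print(fixed, hits)
```

The hit count is there so you can see how much a rule set changed and spot-check the result. Rules like these still need human proofing afterwards, since a pattern that is safe in one language can damage valid words in another.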