Quote:
Originally Posted by Xenophon
If you want to learn more than you knew there was to learn about OCR and OCR errors, go hang out on the Distributed Proofreaders fora.
Nah, that's okay... I've done enough of both to already know most of it. And that's why I say it's not that bad.
Over the years, I developed a 2-step process that eliminated most of the errors from scan-and-OCR:
1. Photocopy the pages, enlarged to letter size;
2. Run the enlarged photocopies through a high-quality auto-feed scanner (such as those built for high-speed network printers like Xerox DocuTech systems) for OCR.
The enlarged pages not only ran better through standardized scan feeders, but the larger letters provided more accurate OCR. It could even speed up the process, in many cases, depending on the equipment you used.
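For anyone who wants a rough feel for why the enlargement helps, here is a back-of-the-envelope sketch. The page size (5.5 x 8.5 inches) and the 300 dpi scanner setting are assumptions for illustration, not figures from my projects, and the function names are made up. It also ignores whatever blur the photocopier itself adds, so treat the result as a best case.

```python
# Rough arithmetic only: how much a page can be enlarged to fit a letter
# sheet, and what that does to the sampling density on the original page.
# All sizes are in inches and are illustrative assumptions.

LETTER = (8.5, 11.0)  # US letter sheet, width x height


def enlargement_factor(page_w, page_h, target=LETTER):
    """Largest uniform scale at which the page still fits on the target sheet."""
    return min(target[0] / page_w, target[1] / page_h)


def effective_dpi(scanner_dpi, factor):
    """Scanning a copy enlarged by `factor` at `scanner_dpi` samples the
    original page at roughly scanner_dpi * factor dots per inch
    (best case; copier blur is not modelled)."""
    return scanner_dpi * factor


if __name__ == "__main__":
    factor = enlargement_factor(5.5, 8.5)  # assumed digest-size book page
    print(f"Enlarge to about {factor:.0%} of original size")
    print(f"A 300 dpi scan then samples the original at about "
          f"{effective_dpi(300, factor):.0f} dpi")
```

The same character ends up covered by more scanner pixels, which is the "larger letters" effect described above.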
Then you check the copy, and hire a proofreader to check behind you.
Using this method, I've had excellent quality and accuracy out of OCR projects. I repeat: It's not rocket science.
Obviously, you still have to pay your workers, and the better your wages, the better the quality of their work. Quality and speed also depend on the speed and quality of the equipment used (and the corresponding cost). But this system works beautifully for most any OCR project.