Quote:
Originally Posted by ricdiogo
Indeed. The *.html versions are mainly done these days by Distributed Proofreaders' post-processors when they produce the basic *.txt version.
Distributed Proofreaders is doing a great job, by the way. Thanks to this system there's not a single OCR mistake, unlike in older materials.
Another great idea for OCR:
http://recaptcha.net/
I'll add it to Feedbooks this month.