Quote:
Originally Posted by montsnmags
I think it's that in the public domain books there can sometimes be so many errors from straight OCR'ing-without-checking (which is why I usually download from MobileRead - they've most often been created by people whose love extends to thorough scanning for errors. Thanks!).
I had this thought...imagine doing a simple OCR and upload of Finnegans Wake. (Aside from verifying against another text) How would you know they were errors of OCR?
Cheers,
Marc
When I worked for a newspaper and we published legal notices in Spanish - a language none of us spoke natively - we proofread them against the original copy back to front, word by word. You'd have to do the same there.
I occasionally proofread for Distributed Proofreaders, and I have to say that most of the newly scanned things have very, very few OCR/scanner errors.