Quote:
Originally Posted by cerement
I think it's because most of the converters here are starting with Gutenberg texts (or archive.org texts or Google Books), stuff that has been OCRed, then proofed a couple of times, but not to any major extent. We're just adding another layer of proofing.
The Internet Archive text files at archive.org are not proofed. Nor are the text files derived from Google Books. The OCR from both leaves an awful lot to be desired. I have just spent all day cleaning one up.