Quote:
Originally Posted by automa
Editing the spelling errors would be understandable because only humans can do that while the other two I mentioned probably can be automated via programming.
Here is a popular program used to pre-process long optical character recognition (OCR) text output before proofreading:
http://home.comcast.net/~thundergnat/guiprep.html
As you can see from my link, an enormous amount of work has already gone into this.
You may have good ideas for additional features. However, if it were easy, or even middling difficult, I think it would already have been done. People are undoubtedly working on some of the hard stuff.
I'm not sure, but a lot of the non-proofread texts at archive.org may already have gone through this sort of software.
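
To give a feel for the kind of cleanup that can be automated, here is a minimal Python sketch. It is not how guiprep works internally. The function name preprocess_ocr and the particular rules are my own assumptions, chosen only to illustrate the idea.

```python
import re


def preprocess_ocr(text: str) -> str:
    """Apply a few mechanical cleanups to raw OCR text (illustrative rules only)."""
    # Strip trailing spaces and tabs at the end of every line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Rejoin words split by a hyphen at a line break: "recog-\nnition" -> "recognition".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces or tabs inside a line to a single space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Remove the stray space OCR often leaves before punctuation.
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    return text


sample = "Optical  character recog-\nnition output , like this   \nneeds cleanup ."
cleaned = preprocess_ocr(sample)
print(cleaned)
```

Note that even this simple rejoining rule gets things wrong: a genuinely hyphenated word such as "well-known" split across a line break would come out as "wellknown". Cases like that are why the hard parts still end up with a human proofreader.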