I've been scanning an old, out-of-print book using ABBYY FineReader 9's scanner and I've had really good success with it. The words it highlights as possible errors usually turn out to be correct; I'd guess about 98% of them.
I did some experimenting with how to do the scans before I hit that sweet spot, though. The best settings I found: use ABBYY's scan interface, scan in grayscale at 600 dpi (the book has pretty small print), adjust the brightness manually to get the background whiter (you'll have to play with that per book), and check the following settings under Options > Scan/Open:
Correct image skew
Detect page orientation (not needed in this book's instance)
Split dual pages
Convert color and grayscale images to black-and-white <---- I believe this setting is what really turned the tables. It gave me MUCH crisper, cleaner, darker images, which resulted in much higher OCR accuracy. Before this change, the page backgrounds often came out grayish (my book has very old, yellowed pages, which made that problem even worse).
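For anyone curious what a grayscale to black-and-white conversion does to a page, here's a minimal sketch in Python. I don't know which algorithm FineReader actually uses; this one is Otsu's method, a common way of picking the cutoff between ink and paper. The function names and the sample pixel values are made up for illustration.

```python
def otsu_threshold(pixels):
    """Pick the gray level (0-255) that best separates dark ink from light paper."""
    # Count how many pixels have each gray level.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0    # number of pixels at or below the candidate cutoff
    sum_dark = 0  # sum of their gray levels
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        if n_dark == 0:
            continue
        n_light = total - n_dark
        if n_light == 0:
            break
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance: larger means ink and paper are better separated.
        var = n_dark * n_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold=None):
    """Map every pixel to pure black (0) or pure white (255)."""
    if threshold is None:
        threshold = otsu_threshold(pixels)
    return [0 if p <= threshold else 255 for p in pixels]


# Hypothetical row of pixels: dark ink (30-40) on grayish, yellowed paper (170-200).
row = [30, 40, 35, 200, 190, 180, 170]
print(binarize(row))
```

The grayish paper values all end up pure white and the ink pure black, which is the same kind of cleanup I saw in the scans once that option was checked.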
I then save each scanned page as a black-and-white PNG image, which has given me the sharpest text. That way, if the OCR reading gets messed up, or I lose a document, or lose everything to a power outage, etc., I still have the images and can reload them without redoing the tedious scanning.
I use fast reading and don't use any patterns. The OCR reads are really fast and accurate now. When I first started with the default settings, there were so many errors that editing them all was unbearable.
Last edited by Ripplinger; 03-17-2011 at 01:55 AM.