MobileRead Forums - View Single Post

Keroberos · 04-03-2012, 09:47 PM

Quote:

Suggestions as to better software or changes to workflow are quite welcome. I'm starting on my second project very soon.

I would definitely recommend switching OCR software (Acrobat's OCR sucks). I use ABBYY FineReader Professional--$170, but worth every penny in my opinion (with training, I don't think I spend more than an hour or two spell checking). They have a cheaper express version for $50, but I don't know how good it is. There are free OCR programs out there, but I can't say how good or user-friendly they are (I tried Tesseract with a GUI front-end but gave up).

For scanning, I use digital-camera-based rigs like those described here, one for hardcovers and one for paperbacks and small hardcovers. I batch crop the images with JPEGCrops, process them with Scan Tailor, OCR with FineReader, and export the text as HTML. I then clean out all the junk code that FineReader can add (and I'm sure Acrobat does too) with Toxaris's excellent Word macro, and finally format the cleaned HTML into an EPUB with Sigil.
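
To give an idea of what the "junk code" cleanup step involves, here is a rough Python sketch. It is not Toxaris's macro and does not reproduce what that macro does. It only illustrates the general idea, assuming the junk is mostly inline span/font wrappers and presentational attributes. The function name and the tag and attribute lists are my own placeholders, and regex-based cleanup like this is only suitable for simple, well-formed OCR output.

```python
import re

# Inline wrappers assumed to be junk in OCR-exported HTML (illustrative list).
JUNK_TAGS = ("span", "font")
# Presentational attributes assumed to be junk (illustrative list).
JUNK_ATTRS = ("style", "class", "lang")


def clean_ocr_html(html: str) -> str:
    """Strip junk inline tags and presentational attributes, keeping the text."""
    # Remove opening and closing junk tags but keep whatever they wrapped.
    tag_pattern = r"</?(?:%s)\b[^>]*>" % "|".join(JUNK_TAGS)
    html = re.sub(tag_pattern, "", html, flags=re.IGNORECASE)
    # Remove quoted attributes such as style="..." wherever they appear.
    attr_pattern = r"\s+(?:%s)\s*=\s*(?:\"[^\"]*\"|'[^']*')" % "|".join(JUNK_ATTRS)
    html = re.sub(attr_pattern, "", html, flags=re.IGNORECASE)
    # Collapse runs of spaces and tabs left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", html)


sample = (
    '<p class="MsoNormal" style="margin:0">'
    '<span style="font-size:11pt">Call me </span>'
    '<font face="Arial">Ishmael.</font></p>'
)
print(clean_ocr_html(sample))  # <p>Call me Ishmael.</p>
```

Structural tags such as p and i are left alone, so the cleaned file should still open in Sigil with paragraphs and italics intact.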