Quote:
Suggestions as to better software or changes to workflow are quite welcome. I'm starting on my second project very soon.
I would definitely recommend switching OCR software (Acrobat's OCR sucks). I use ABBYY FineReader Professional. It's $170, but worth every penny in my opinion (with training, I don't think I spend more than an hour or two spell checking). They have a cheaper Express version for $50, but I don't know how good it is. There are free OCR programs out there, but I can't say how good or user-friendly they are (I tried Tesseract with a GUI front-end but gave up).
For scanning, I use digital camera-based rigs like those described here, one for hardcovers and one for paperbacks and small hardcovers. I then batch crop the images with JPEGCrops, process them with Scan Tailor, OCR with FineReader, export the text as HTML, and clean out all the junk code that FineReader can add (and I'm sure Acrobat does too) with Toxaris's excellent Word macro. Then I format the cleaned HTML into an EPUB with Sigil.
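
If you'd rather not go through Word for the cleanup step, here is a rough Python sketch of the same idea. To be clear, this is not Toxaris's macro and I'm only guessing at what the junk looks like: it assumes the clutter is mostly `<span>`/`<font>` wrappers and inline `style`/`class`/`lang` attributes. The function name `clean_ocr_html` is made up for illustration.

```python
import re


def clean_ocr_html(html: str) -> str:
    """Strip presentational markup that OCR HTML exports tend to leave behind.

    Assumption: the junk is <span>/<font> wrappers plus inline
    style/class/lang attributes. Adjust the patterns to your own export.
    """
    # Drop <span> and <font> tags but keep the text inside them.
    html = re.sub(r"</?(?:span|font)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove quoted style/class/lang attributes from the remaining tags.
    html = re.sub(
        r"\s+(?:style|class|lang)\s*=\s*(?:\"[^\"]*\"|'[^']*')",
        "",
        html,
        flags=re.IGNORECASE,
    )
    # Collapse runs of spaces and tabs left over after stripping tags.
    html = re.sub(r"[ \t]{2,}", " ", html)
    return html


sample = (
    '<p class="x1" style="margin:0">'
    '<span style="font-size:10pt">Hello</span>  '
    '<font face="Arial">world</font></p>'
)
print(clean_ocr_html(sample))  # <p>Hello world</p>
```

It's regex-based, so it can trip on unusual markup (for example, body text that happens to contain something like `class="..."`). Check the result in Sigil before trusting it on a whole book.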