My experience with discoloured old sci-fi paperbacks and FineReader is that it doesn't work too well.
I'm scanning my books destructively into multipage TIFF files (using home-made software), then OCRing them with FineReader. My more discoloured paperbacks come out with so many OCR errors that they are almost unreadable. Books in the same series, in the same font and all, which aren't discoloured convert fine.
An earlier respondent's recommendation of PDF should work, in that you will be able to read the text even if the OCR hasn't worked well (because the original image is retained in some fashion). However, you will lose reflow of the pages, which would be a no-no for me.
One trick which may be interesting is to use adaptive thresholding. Rather than picking a single threshold level for the whole page, this turns a pixel black (or white) if it is darker (or lighter) than the immediately surrounding pixels. I've not done this for a whole book as yet, but in my tests with single pages it takes a page which produces rubbish in Omnireader and generates near-perfect text.
Sadly, the only way I know to get this facility is with the OpenCV image processing library. There may be commercial software which incorporates this, but I don't know of any.
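For anyone curious how adaptive thresholding works, here is a rough sketch of the idea in plain Python, with no libraries needed. It is only an illustration, not OpenCV's code and not my scanning software. The function name `adaptive_threshold` and the parameters `block` (size of the neighbourhood) and `offset` (how much darker than its surroundings a pixel must be to count as ink) are names I made up for the example, and the tiny test "page" is synthetic. In OpenCV itself the corresponding call is `cv2.adaptiveThreshold`.

```python
def adaptive_threshold(gray, block=15, offset=10):
    """Binarise a greyscale image against the local mean.

    gray   -- list of rows, each a list of ints in 0..255
    block  -- side length of the square neighbourhood (hypothetical name)
    offset -- a pixel goes black only if it is more than this much
              darker than the mean of its neighbourhood (hypothetical name)
    Returns rows of 0 (black) and 255 (white).
    """
    h = len(gray)
    w = len(gray[0]) if h else 0

    # Integral image: integ[y][x] holds the sum of gray[0..y-1][0..x-1],
    # so any rectangle sum costs four lookups.
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    r = block // 2
    out = []
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = (integ[y1][x1] - integ[y0][x1]
                     - integ[y1][x0] + integ[y0][x0])
            count = (y1 - y0) * (x1 - x0)
            # Same test as gray < mean - offset, kept in integers.
            is_ink = gray[y][x] * count < total - offset * count
            row.append(0 if is_ink else 255)
        out.append(row)
    return out


# Synthetic "discoloured page": the paper darkens from left to right,
# and a few ink pixels sit 70 levels below the paper around them.
WIDTH, HEIGHT = 40, 9
page = [[220 - 3 * x for x in range(WIDTH)] for _ in range(HEIGHT)]
ink_columns = [4, 12, 20, 28, 36]
for x in ink_columns:
    page[4][x] -= 70

result = adaptive_threshold(page, block=7, offset=10)
```

On this made-up page the ink on the light side is actually brighter than the bare paper on the dark side, so no single threshold level can separate them, whereas comparing each pixel with its own neighbourhood does. Real scans would need `block` and `offset` tuned by trial and error.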