Why are you saving them as BMP and then TIFF? For OCR purposes, JPG (85-90% quality) works just fine and takes up A LOT less space in the initial scanning phase. I scan at 300 DPI for the pages, and 600 DPI for the covers or other graphics in the book (charts, photos, etc.). For text pages, anything lower or higher than 300 DPI could mess with the OCR. For instance, at 600 DPI, small imperfections are detected as commas, dots, accents, etc. And since scanning at 150 DPI takes almost as long as 300 DPI, I use 300.
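To put rough numbers on the space argument, here's a back-of-the-envelope sketch in Python. The 6 x 9 inch page size and the `raw_scan_bytes` name are my own assumptions for illustration, and it ignores file headers and row padding, so treat the results as ballpark figures for an uncompressed 24-bit scan (which is roughly what a BMP is).

```python
def raw_scan_bytes(width_in: float, height_in: float, dpi: int,
                   bytes_per_pixel: int = 3) -> int:
    """Approximate size of an uncompressed scan in bytes.

    Assumes 24-bit color (3 bytes per pixel) by default and ignores
    file headers and row padding.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # Hypothetical 6 x 9 inch book page at the three resolutions discussed.
    for dpi in (150, 300, 600):
        size_mb = raw_scan_bytes(6, 9, dpi) / 1_000_000
        print(f"{dpi} DPI: about {size_mb:.1f} MB uncompressed")
```

Doubling the DPI quadruples the raw size, so a few hundred pages saved uncompressed add up fast. How much a JPG at 85-90% saves depends on the page content, so I'm not putting a number on that here.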
Quote:
DigiBook now converts BMP to TIF so that the OCR can take place using SprintExpress.
[...]
One does text editing with a word processor of ones own choosing - NOT Abbyy Fine Reader.
|
ABBYY FineReader 9.0 Sprint is not that good. It's from 2007 or something like that. You should do the proofreading in FineReader Pro, where you have a side-by-side view of the scanned image.
Speaking of proofreading, you say that it "spoils your enjoyment of leisure reading" but that you also read them "again and again". Then why not make an effort to read them in FineReader at least once? You'll enjoy reading them the second time (on the e-reader) much more, because then you won't have to stop for misspellings or words that sound funny. Make good use of the dictionary when proofreading. Don't just "load it with everything you can" because the words sound right. I usually look them up on dictionary.com first. If the printed book contains misspellings, I correct those too. In the (not so distant) future I may use text-to-speech software.