MobileRead Forums - View Single Post

luqmaninbmore · 09-30-2009, 06:50 PM

Quote:

Originally Posted by Moejoe

That's actually a good idea because ABBYY can scan a PDF into HTML/TXT/RTF etc. So if you have a sheetfeeder the above suggestion is sound. Whip it through the sheetfeeder, output a PDF, then run it through ABBYY at the end.

Now if only I could find an OCR program on 'buntu or Mac that worked as well as ABBYY.

EDIT: Or find someone online who's sharing the book (for a lot of what I'm scanning there's no chance of that) and download it from them, as the above poster stated.

On Linux, I find that Tesseract OCR works pretty well, provided that you're using TIF files as input and the resolution suits the source: high enough to resolve the text, but not necessarily the maximum (for some old, yellowed paperbacks, a lower resolution results in better output).
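
For anyone who wants to batch this, here is a rough sketch of running Tesseract over a folder of TIF scans from Python. It assumes the `tesseract` command is installed and on your PATH; the function names and the `scans` folder are made up for illustration, and the language code `eng` is just a default you may need to change.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path, lang: str = "eng") -> list[str]:
    """Build the command line: tesseract <image> <output base> -l <lang>."""
    # Tesseract appends ".txt" to the output base on its own,
    # so the base is simply the image path without its extension.
    output_base = image.with_suffix("")
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_directory(scan_dir: Path, lang: str = "eng") -> list[Path]:
    """Run Tesseract on every .tif/.tiff file in scan_dir (hypothetical helper).

    Returns the paths of the text files Tesseract is expected to write.
    """
    outputs = []
    images = sorted(
        p for p in scan_dir.iterdir() if p.suffix.lower() in {".tif", ".tiff"}
    )
    for image in images:
        # check=True raises CalledProcessError if Tesseract exits with an error.
        subprocess.run(tesseract_command(image, lang), check=True)
        outputs.append(image.with_suffix(".txt"))
    return outputs
```

Calling `ocr_directory(Path("scans"))` would then leave one `.txt` file next to each page image. If a book comes out badly, it may be worth rescanning or resampling a few pages at a different resolution and comparing the text before running the whole batch.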

Luqman