Quote:
Originally Posted by hannah
In fact the OCR is very bad, a lot worse than what I did already. Is there OCR software that can process PDF files? Mine is a very old ABBYY FineReader 5.0 Sprint and it only reads images (TIFF).
Newer ABBYY versions can read PDFs. Alternatively, you can extract TIFF images from the PDF (with pdfimages on Linux, for instance) and feed those to your old FineReader. Or you could process the raw scanned images instead: click on "All files: HTTP" and download the *_orig_jp2.tar or similar.
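
If you want to script the pdfimages route, here is a rough sketch in Python. It assumes the Poppler build of pdfimages (the one most Linux distributions ship), since the -tiff option comes from Poppler and the older xpdf tool may not have it. The function names, the "page" prefix and the file names in the usage line are just placeholders I made up.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, out_prefix):
    """Return the argument list that asks pdfimages for TIFF output."""
    # Assumption: Poppler's pdfimages, which accepts -tiff.
    return ["pdfimages", "-tiff", str(pdf_path), str(out_prefix)]


def extract_tiffs(pdf_path, out_dir):
    """Run pdfimages and return the TIFF files it wrote, sorted by name."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages not found on PATH (install poppler-utils)")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdfimages names its output <prefix>-NNN.tif, one file per embedded image.
    prefix = out_dir / "page"
    subprocess.run(build_pdfimages_command(pdf_path, prefix), check=True)
    return sorted(out_dir.glob("page-*.tif"))
```

Something like extract_tiffs("book.pdf", "tiffs") should then leave you with a folder of TIFF files that an image-only OCR program can open. Note that this pulls out the embedded images as they are stored, so for a scanned book you normally get one image per page, but that is not guaranteed for every PDF.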