03-12-2018, 08:19 PM   #1527
MarjaE
I've been experimenting with different OCR tools: the built-in OCR in k2pdfopt, Elucidate, and ocrmypdf.

All of these use Tesseract, but the k2pdfopt version often misses text that the others recognize.

Unfortunately, running OCR in either Elucidate or ocrmypdf, and then converting the result in either k2pdfopt or Ghostscript, often leads to an unreadable mess.

Is there any way to OCR and convert in k2pdfopt while getting the OCR quality of the other Tesseract-based tools? After setting up the tessdata folder, is it just a matter of downloading the language data from tessdata-best instead of the plain tessdata?