Quote:
Originally Posted by gg4u
Hi Willus,
I tried to install tessdata v.3.05 from:
https://github.com/tesseract-ocr/tessdata
It works. It is processing now; I'll check the result when it finishes, but at least it is working.
Could you tell me which files I need to keep to process the eng language?
Would you consider updating to Tesseract v4.0?
I looked at the git repos for k2pdfopt, but:
- I could not compile it because I am missing a header file: k2pdfopt.h
- I don't know enough C or Tesseract to modify your wrapper :/
I am hoping to eventually compile with Tesseract 4.0.0. It was officially released only three weeks ago (Oct 29, 2018). I don't recommend trying to build k2pdfopt yourself unless you are pretty adventurous. It has a lot of dependencies.
For Tesseract 3.05, you need these files in your data folder:
eng.cube.params
eng.cube.nn
eng.cube.bigrams
eng.cube.lm
eng.tesseract_cube.nn
eng.cube.word-freq
eng.cube.size
eng.cube.fold
eng.traineddata
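If you want to double-check that you kept everything, here is a rough Python sketch that reports which of the files listed above are missing from a folder. The function name missing_eng_files and the folder name "tessdata" are just placeholders I made up for illustration; point it at wherever your data folder actually is.

```python
from pathlib import Path

# The eng files listed above for Tesseract 3.05.
REQUIRED_ENG_FILES = [
    "eng.cube.params",
    "eng.cube.nn",
    "eng.cube.bigrams",
    "eng.cube.lm",
    "eng.tesseract_cube.nn",
    "eng.cube.word-freq",
    "eng.cube.size",
    "eng.cube.fold",
    "eng.traineddata",
]


def missing_eng_files(data_dir):
    """Return the names from REQUIRED_ENG_FILES that are not files in data_dir."""
    folder = Path(data_dir)
    return [name for name in REQUIRED_ENG_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # "tessdata" is a hypothetical folder name; replace it with your own path.
    missing = missing_eng_files("tessdata")
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All eng files are present.")
```

It only checks that the files exist by name. It does not verify that they are the right version or that they are not corrupted.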