11-18-2018, 02:30 PM   #1621
willus
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by gg4u
Hi Willus,

I tried to install tessdata v.3.05 from:

https://github.com/tesseract-ocr/tessdata

It works and is processing now. I'll check the result when it finishes, but at least it is working.

Could you tell me which files I need to keep to process the eng language?

Would you consider updating to Tesseract v.4.0?

I looked at the git repos for k2pdfopt, but:
- I could not compile because I am missing the header file k2pdfopt.h
- I don't know enough C or Tesseract to modify your wrapper :/

I am hoping to eventually compile with Tesseract 4.0.0. It was officially released only three weeks ago (Oct 29, 2018). I don't recommend trying to build k2pdfopt yourself unless you are pretty adventurous, since it has a lot of dependencies.

For Tesseract 3.05, you need these files in your data folder:

eng.cube.params
eng.cube.nn
eng.cube.bigrams
eng.cube.lm
eng.tesseract_cube.nn
eng.cube.word-freq
eng.cube.size
eng.cube.fold
eng.traineddata
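
If it helps, here is a rough Python sketch for checking that a data folder contains every file in the list above. It is not part of k2pdfopt or Tesseract. The function name missing_tessdata_files and the folder name "tessdata" in the usage comment are just assumptions for illustration, so point it at wherever your data folder actually lives.

```python
from pathlib import Path

# File names copied from the list above (Tesseract 3.05, English).
REQUIRED_ENG_FILES = [
    "eng.cube.params",
    "eng.cube.nn",
    "eng.cube.bigrams",
    "eng.cube.lm",
    "eng.tesseract_cube.nn",
    "eng.cube.word-freq",
    "eng.cube.size",
    "eng.cube.fold",
    "eng.traineddata",
]


def missing_tessdata_files(folder):
    """Return the required file names that are not present in `folder`.

    Hypothetical helper: it only checks that each file exists,
    not that its contents are valid.
    """
    data_dir = Path(folder)
    return [name for name in REQUIRED_ENG_FILES
            if not (data_dir / name).is_file()]


# Example use ("tessdata" is an assumed folder name):
#   for name in missing_tessdata_files("tessdata"):
#       print("missing:", name)
```

An empty list back means all nine files are there. Anything else in the folder that starts with a different language code should not be needed for eng.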