View Single Post
Old 05-11-2013, 07:20 PM   #422
willus
Fuzzball, the purple cat
willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.willus ought to be getting tired of karma fortunes by now.
 
willus's Avatar
 
Posts: 583
Karma: 2526455
Join Date: Jun 2011
Location: California
Device: Kindle 2, iPad
Quote:
Originally Posted by kundor View Post
I'm using k2pdfopt to convert a large mathematical text. On the Tesseract download page, I noticed a file "tesseract-ocr-3.02.equ.tar.gz" which says it's a "Math / equation detection module for Tesseract 3.02." This sounds like it would help to OCR the math part correctly. The majority of the text is English. Is there some way to get the OCR engine to use this, in combination with the English training data?
I have no idea on this one--I suppose I need to do some homework on Tesseract and if there is a way to use multiple training files. You might try using the math training file and just see what you get.
willus is offline   Reply With Quote