05-24-2013, 03:56 PM   #430
markom
Quote:
Originally Posted by kundor
I'm using k2pdfopt to convert a large mathematical text. On the Tesseract download page, I noticed a file "tesseract-ocr-3.02.equ.tar.gz" which says it's a "Math / equation detection module for Tesseract 3.02." This sounds like it would help to OCR the math part correctly. The majority of the text is English. Is there some way to get the OCR engine to use this, in combination with the English training data?
Have you tried kindlepdfviewer yet? It reads DjVu and supports fit-to-document-width (or height) and fit-to-content-width (or height) in portrait and landscape, plus two-point cropping.

It also does reflow.

https://www.mobileread.com/forums/sho....php?p=2466450

You can also convert the DjVu to an image-only PDF, run that through k2pdfopt, and then OCR the k2pdfopt output PDF with ABBYY FineReader, Acrobat, etc. (in text-under-image mode).
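A rough sketch of the first two steps, in case it helps. It only builds and prints the command lines, it does not run anything. It assumes DjVuLibre's ddjvu is installed, and the file names are placeholders. The k2pdfopt switches (-ui-, -x, -o) are my assumption of the usual non-interactive options, so check them against your version's help output. The OCR step in FineReader or Acrobat stays manual.

```python
import shlex


def build_commands(djvu_path, work_pdf="book.pdf", out_pdf="book_k2opt.pdf"):
    """Return the two command lines as argument lists (nothing is executed)."""
    # Step 1: ddjvu (DjVuLibre) renders the DjVu pages into an image-only PDF.
    to_pdf = ["ddjvu", "-format=pdf", djvu_path, work_pdf]
    # Step 2: k2pdfopt reflows the page images for the reader screen.
    # Assumed switches: -ui- (no interactive menu), -x (exit when done),
    # -o (output file name). Verify against your k2pdfopt version.
    reflow = ["k2pdfopt", "-ui-", "-x", "-o", out_pdf, work_pdf]
    return [to_pdf, reflow]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for cmd in build_commands("book.djvu"):
        print(shlex.join(cmd))
    # Step 3 (manual): open the k2pdfopt output in FineReader or Acrobat
    # and OCR it in text-under-image mode.
```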

For an average book, OCR should take about an hour in detailed mode, or about half an hour in quick mode.

https://www.mobileread.com/forums/sho...&postcount=413

Last edited by markom; 05-24-2013 at 04:57 PM.