MobileRead Forums - View Single Post

noisy · 06-24-2013, 01:59 AM

Many PDF files are scans of very old documents. Some of them also contain an OCR text layer, as in this document: http://polona.pl/archive_prod?uid=1095122&cid=1095117

I tried converting it to MOBI in Calibre, but the resulting MOBI file contains only the scanned images, without any of the text that could be taken from the OCR layer.

Is there any way to extract the OCR text from this PDF and convert only that text to MOBI?
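
To make the question concrete, here is a rough sketch of the kind of workflow I have in mind. It is untested against the document linked above, and it assumes two things: that the OCR layer is an ordinary embedded text layer that Poppler's pdftotext can read, and that Calibre's ebook-convert command-line tool is on the PATH. The function names and the work_dir parameter are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(pdf_path, work_dir="."):
    """Return the two command lines: text extraction, then TXT -> MOBI."""
    pdf = Path(pdf_path)
    txt = Path(work_dir) / (pdf.stem + ".txt")
    mobi = Path(work_dir) / (pdf.stem + ".mobi")
    # pdftotext (Poppler) dumps the PDF's text layer to a plain-text file.
    extract = ["pdftotext", "-enc", "UTF-8", str(pdf), str(txt)]
    # ebook-convert (Calibre) picks formats from the file extensions.
    convert = ["ebook-convert", str(txt), str(mobi)]
    return extract, convert


def pdf_text_layer_to_mobi(pdf_path, work_dir="."):
    """Extract only the text layer and convert it to MOBI (assumed workflow)."""
    for tool in ("pdftotext", "ebook-convert"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} was not found on PATH")
    extract, convert = build_commands(pdf_path, work_dir)
    subprocess.run(extract, check=True)   # writes <name>.txt
    subprocess.run(convert, check=True)   # writes <name>.mobi
    return convert[-1]
```

Calling pdf_text_layer_to_mobi("scan.pdf") would, if both assumptions hold, leave scan.txt and scan.mobi in the current directory. The quality of the result depends entirely on the quality of the OCR layer, so the text file is probably worth checking before converting.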

#1	noisy Member Posts: 22 Karma: 12 Join Date: Oct 2011 Device: kindle 3	PDF with OCR to MOBI