MobileRead Forums - View Single Post - photo pdf or clearscanned pdf can be processed quicker?

DDHarriman · 11-14-2011, 03:58 PM

Hello

You are right, ClearScan has the lowest impact.
Basically, ClearScan gives the result closest to what you would get if you had written and formatted the text and images of the PDF yourself.
The catch is that all the OCR errors (incorrect or unrecognized characters, words, or phrases) will show, and one must correct them by hand; “proofreading” is the term for it. Acrobat is not the best tool for this correcting.

The other two options produce what one calls a “two-layer” PDF: one layer is the original (or compressed) image and the other is the text (the result of the OCR processing). Such a file takes up at least as much space, and puts at least as much load on the eBook reader, as a PDF made from the scanned images alone.
In practice, for your problem, doing a searchable image OCR (exact or not) in Acrobat is useless.
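
To make the size argument concrete, here is a rough back-of-the-envelope sketch in Python. The function names and all the numbers (page count, bytes per page) are made up for illustration and are not measurements from Acrobat. It only shows that a two-layer file keeps every page image and adds a text layer on top, so it can never be smaller than the image-only PDF.

```python
def image_only_size(pages, image_bytes_per_page):
    """Size of a PDF that holds nothing but the scanned page images."""
    return pages * image_bytes_per_page


def searchable_image_size(pages, image_bytes_per_page, text_bytes_per_page):
    """Size of a two-layer PDF: each page keeps its image plus a hidden text layer."""
    return pages * (image_bytes_per_page + text_bytes_per_page)


# Hypothetical figures, chosen only to illustrate the comparison.
pages = 300
image_bytes_per_page = 100_000  # assumed size of one compressed page scan
text_bytes_per_page = 3_000     # assumed size of the OCR text for one page

image_only = image_only_size(pages, image_bytes_per_page)
two_layer = searchable_image_size(pages, image_bytes_per_page, text_bytes_per_page)

print("image only:", image_only, "bytes")
print("two layer: ", two_layer, "bytes")
print("two layer is at least as large:", two_layer >= image_only)
```

Whatever values you plug in, the reader still has to decode the same page images, which is why the searchable image options bring no relief on the device.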

Best regards,

Last edited by DDHarriman; 11-14-2011 at 04:01 PM.