Quote:
Originally Posted by Aurinko
I have some PDFs that I'm fairly certain are scanned. You can tell because the font is inconsistent: not all 'e' letters look the same, and there is slight variation (that I don't think is attributable to smoothing, scaling, etc.). There are also random artifacts on the pages that are clearly graphics and not text, which further hints at the pages being scanned. I can, however, highlight this apparently scanned text, which made me suspect that the Sony's PDF reader software has OCR. I haven't yet found a document on which I couldn't highlight text on the Sony, but I'd be curious to know how the behavior is determined.
It's likely that the PDF already has the OCR text embedded in it as a hidden layer. Some scanned PDFs are made that way so they can be searched. I would be surprised if the Sony did the OCR itself, since that is very time- and power-consuming.
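
If you want to check a file on a computer, here is a rough sketch in Python of one way to look for such a hidden layer. It assumes the OCR text was written with PDF text rendering mode 3 (invisible text, set with `3 Tr`), which is a common way to do it but not the only one. The function names are my own, and the byte scan is only a heuristic: it handles uncompressed and Flate-compressed streams and will miss anything stored differently.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)(?:\r?\n)?endstream", re.DOTALL)
# "3 Tr" selects text rendering mode 3, i.e. invisible text.
INVISIBLE_TEXT_RE = re.compile(rb"(?<![\d.])3\s+Tr\b")
# Tj and TJ are the operators that actually draw a string.
SHOW_TEXT_RE = re.compile(rb"\bT[jJ]\b")


def iter_decoded_streams(pdf_bytes):
    """Yield each stream body, inflated when it is Flate-compressed."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data (plain content, JPEG image, ...): use as-is.
            yield raw


def has_hidden_text_layer(pdf_bytes):
    """Heuristic: True if some stream draws text in invisible mode."""
    for data in iter_decoded_streams(pdf_bytes):
        if INVISIBLE_TEXT_RE.search(data) and SHOW_TEXT_RE.search(data):
            return True
    return False
```

To use it, read the file with `open(path, "rb").read()` and pass the bytes to `has_hidden_text_layer`. A `True` result suggests the text you can highlight was embedded when the PDF was created, rather than recognized by the reader.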