Quote:
Originally Posted by ottischwenk
A scan always results in an image and an image never contains any text that can be further processed.
You can only try to extract this with OCR.
|
We both mentioned PDF scans, not just scanned image files such as JPEG or BMP.
A PDF scan is indeed non-searchable (and non-highlightable) before OCR is applied, but once OCR has been run and the OCR text layer has been saved in the PDF, it becomes a searchable PDF scan.
That said, nowadays we can search and highlight even non-searchable PDF scans, e.g. with iOS/Android apps that automatically apply OCR to the currently open page as we read it (without saving the result back to the PDF). We can also search a whole folder of non-searchable PDF scans this way without opening any file.
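
For the first case (OCR applied once and the text layer saved inside the PDF), here is a rough sketch of how that could be scripted on a desktop. It assumes the open-source OCRmyPDF command-line tool is installed; nobody in this thread mentioned it, it is just one tool I know can do this. The file names and the helper functions `build_ocr_command` and `make_searchable` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst, language="eng"):
    """Build an OCRmyPDF command line that writes a searchable copy of src to dst."""
    return [
        "ocrmypdf",
        "--skip-text",   # leave pages that already have a text layer untouched
        "-l", language,  # Tesseract language code (assumption: English)
        str(src),
        str(dst),
    ]


def make_searchable(src, dst, language="eng"):
    """Run OCRmyPDF if it is installed; return True only if it exits successfully."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not found on PATH
    result = subprocess.run(build_ocr_command(src, dst, language))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, replace with your own scan.
    ok = make_searchable(Path("scan.pdf"), Path("scan_searchable.pdf"))
    print("searchable copy written" if ok else "OCR step did not complete")
```

The original scan is left as it is; the searchable version is written to a separate file, so you can compare the two in your reader.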