MobileRead Forums - View Single Post - looking for privacy friendly quality ereader

Marinolino · 06-05-2020, 12:43 AM

Quote:

Originally Posted by ottischwenk

A scan always results in an image and an image never contains any text that can be further processed.
You can only try to extract this with OCR.

We both mentioned PDF scans, not just scanned image files (JPEG, BMP, etc.).

Before OCR is applied, PDF scans are indeed non-searchable (and non-highlightable). Once OCR has been applied, they become searchable PDF scans, provided the OCR text layer has been saved within the PDF.
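To make the difference concrete, here is a rough sketch (my own illustration, not how any particular reader app does it) of checking whether a PDF carries a text layer at all. It assumes an unencrypted file whose page content streams are either uncompressed or Flate-compressed. The function name `pdf_has_text_layer` is made up for this example. It is only a heuristic: it looks for text-showing operators (`Tj`/`TJ` inside a `BT` block) and skips image streams.

```python
import re
import zlib

# One match per stream object: group 1 is everything between "obj" and
# "stream" (the stream dictionary), group 2 is the raw stream data.
OBJ_STREAM_RE = re.compile(rb"\bobj\b(.*?)\bstream\r?\n(.*?)endstream", re.DOTALL)

# Image XObjects hold pixel data, never page text, so they are skipped.
IMAGE_DICT_RE = re.compile(rb"/Subtype\s*/Image\b")

# A text-showing operator: BT, then a string or array operand, then Tj or TJ.
TEXT_OP_RE = re.compile(rb"\bBT\b.*?[)\]>]\s*T[jJ]\b", re.DOTALL)


def pdf_has_text_layer(data: bytes) -> bool:
    """Heuristic: True if any non-image stream contains text-showing operators."""
    for match in OBJ_STREAM_RE.finditer(data):
        header, raw = match.group(1), match.group(2)
        if IMAGE_DICT_RE.search(header):
            continue  # the scanned page image itself
        try:
            # decompressobj tolerates the newline left before "endstream"
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            content = raw  # assume the stream was stored uncompressed
        if TEXT_OP_RE.search(content):
            return True
    return False
```

You would call it as `pdf_has_text_layer(open("scan.pdf", "rb").read())`, where `scan.pdf` is just a placeholder name. A plain scan should give `False`, and the same scan saved with its OCR layer should give `True`, since OCR tools typically write the recognized words as (usually invisible) text on top of the page image.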

Nowadays, though, we can search and highlight even non-searchable PDF scans, e.g. with iOS/Android apps that automatically apply OCR to the currently open page as we read it (without saving the result to the PDF). In the same way, we can search a whole folder of non-searchable PDF scans without opening any file.