MobileRead Forums - View Single Post - iRex Digital Reader works to view Google Pdfs

Grimulkan · 03-29-2009, 02:16 PM

Most Google .pdf files are actually image-only and do not even include the OCR'ed text. This is ridiculous, considering they have an image-over-text version "online" and allow downloading of text-only formats, but they probably have their reasons. Meanwhile, you can view the pdf files on the DR (slowly, because they're giant images), but if you want to do a full-text search you'll have to run them through Adobe Acrobat Professional or ABBYY FineReader.

Theoretically, OCR software should be able to take a .txt file and match it to an image-only pdf, making it an OCR'ed pdf (without the mistakes of automatic OCR), but I don't know of any that does that.
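For what it's worth, the matching step itself could be roughly sketched like this. This is not an existing tool, just an illustration under some assumptions: an OCR pass has already given you a list of (word, bounding box) pairs for a page, and you have the trusted words for that page from the .txt file. The function name and the data layout are made up for the example; only Python's standard difflib is used. Writing the corrected words back into the pdf as a hidden text layer is a separate job and is not shown.

```python
from difflib import SequenceMatcher


def align_text_to_boxes(ocr_words, true_words):
    """Pair trusted words with the boxes found by a (possibly wrong) OCR pass.

    ocr_words:  list of (word, box) tuples; hypothetical format, where box is
                whatever position data the OCR engine reports.
    true_words: list of words from the trusted .txt file, in reading order.
    Returns a list of (word, box) tuples for the page.
    """
    # Compare case-insensitively so "The" and "the" count as the same word.
    ocr_tokens = [word.lower() for word, _ in ocr_words]
    true_tokens = [word.lower() for word in true_words]
    matcher = SequenceMatcher(a=ocr_tokens, b=true_tokens, autojunk=False)

    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        if tag == "equal" or (tag == "replace" and same_length):
            # One OCR word per trusted word: keep the box, take the trusted spelling.
            for k in range(i2 - i1):
                result.append((true_words[j1 + k], ocr_words[i1 + k][1]))
        elif tag in ("replace", "delete"):
            # Ambiguous or extra OCR words: leave them as the OCR produced them.
            result.extend(ocr_words[i1:i2])
        # tag == "insert": trusted words with no box on the page are skipped.
    return result


ocr_page = [
    ("Tbe", (0, 0, 10, 5)),
    ("quick", (12, 0, 30, 5)),
    ("brovvn", (32, 0, 50, 5)),
    ("fox", (52, 0, 60, 5)),
]
trusted = "The quick brown fox".split()
fixed_page = align_text_to_boxes(ocr_page, trusted)
```

Here the misread words "Tbe" and "brovvn" sit between words that match, so they get replaced by the trusted spelling while keeping their boxes. A real book would need this done page by page, and split or merged words (where OCR sees two words as one) are simply left alone in this sketch.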
