Most Google .pdf files are actually image-only and do not even include the OCR'ed text. This is ridiculous, considering that they have an image-over-text version online and allow downloading of text-only formats, but they probably have their reasons. Meanwhile, you can view the PDF files on the DR (slowly, because they're giant images), but if you want to do a full-text search you'll have to run them through Adobe Acrobat Professional or ABBYY FineReader.
Theoretically, OCR software should be able to take a .txt file and match it to an image-only PDF, making it an OCR'ed PDF (without the mistakes of automatic OCR), but I don't know of any that does that.
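
To make the matching idea concrete, here is a rough Python sketch of how it might work. It is not a description of any existing tool. It assumes you already have a noisy OCR pass that gives each word a bounding box, plus the clean words from the .txt file. The names `align_text_to_boxes` and `merge_boxes` and the `(x0, y0, x1, y1)` box format are made up for illustration. Writing the result back into the PDF as a hidden text layer is not shown.

```python
from difflib import SequenceMatcher


def merge_boxes(boxes):
    """Return the smallest (x0, y0, x1, y1) box covering all given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def align_text_to_boxes(ocr_words, clean_words):
    """Attach clean words to the positions found by a noisy OCR pass.

    ocr_words:   list of (word, box) pairs from automatic OCR (hypothetical input).
    clean_words: list of words from the proofread .txt file.
    Returns a list of (clean_text, box) pairs.
    """
    noisy = [word.lower() for word, _ in ocr_words]
    clean = [word.lower() for word in clean_words]
    matcher = SequenceMatcher(a=noisy, b=clean, autojunk=False)

    placed = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            # Same number of words on both sides: assume a one-to-one
            # correspondence and keep each OCR box with the clean spelling.
            for k in range(i2 - i1):
                placed.append((clean_words[j1 + k], ocr_words[i1 + k][1]))
        elif tag == "replace":
            # Uneven run (OCR split or merged words): put the joined clean
            # text on one box that covers the whole run.
            box = merge_boxes([b for _, b in ocr_words[i1:i2]])
            placed.append((" ".join(clean_words[j1:j2]), box))
        # "delete": OCR noise with no counterpart in the clean text, dropped.
        # "insert": clean words with no known position, skipped in this sketch.
    return placed


if __name__ == "__main__":
    ocr = [("Tbe", (0, 0, 10, 5)), ("quick", (12, 0, 30, 5)),
           ("hrown", (32, 0, 50, 5)), ("fox", (52, 0, 60, 5))]
    for text, box in align_text_to_boxes(ocr, ["The", "quick", "brown", "fox"]):
        print(text, box)
```

The alignment works on whole words, so it only needs the OCR pass to get most words roughly right. Pages where the OCR is badly off would leave clean words without a position.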