Old 03-29-2009, 02:16 PM   #5
Grimulkan
Lord
Posts: 177
Karma: 328
Join Date: Feb 2009
Device: Q1 (on way out), PRS505, DR1000S (dead :<), TC1100 (10'' perfection!)
Most Google .pdf files are actually image-only and do not even include the OCR'ed text. This is ridiculous, considering they have an image-over-text version online and allow downloading of text-only formats, but they probably have their reasons. Meanwhile, you can view the PDF files on the DR (slowly, because they're giant images), but if you want to do a full-text search you'll have to run them through Adobe Acrobat Professional or ABBYY FineReader.

Theoretically, OCR software should be able to take a .txt file and match it to an image-only PDF, producing an OCR'ed PDF without the mistakes of automatic OCR, but I don't know of any that does that.
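
To make the idea concrete, here is a rough sketch of the matching step only, not a working tool. It assumes you already have noisy OCR output as a list of (word, bounding box) pairs for a page, plus the clean words from the .txt file. The function name `transfer_boxes` and the box format are made up for illustration, and writing the result back into the PDF as a hidden text layer is not shown. It uses Python's standard `difflib` to align the two word sequences and copies each OCR box onto the corresponding clean word.

```python
from difflib import SequenceMatcher


def transfer_boxes(ocr_words, true_words):
    """Align noisy OCR words with known-good words and reuse the OCR boxes.

    ocr_words:  list of (text, box) pairs from an OCR pass (hypothetical format).
    true_words: list of correct words, e.g. split from the .txt file.
    Returns a list of (true_word, box_or_None) in reading order.
    """
    ocr_tokens = [text.lower() for text, _ in ocr_words]
    true_tokens = [word.lower() for word in true_words]

    # autojunk=False so frequent words such as "the" are not ignored on long pages.
    matcher = SequenceMatcher(a=ocr_tokens, b=true_tokens, autojunk=False)

    # Start with no position for any word, then fill in what can be aligned.
    result = [(word, None) for word in true_words]
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        # "equal": OCR got the word right. A same-length "replace" is treated
        # as a run of misread words, so boxes are paired off one to one.
        if tag == "equal" or (tag == "replace" and same_length):
            for k in range(j2 - j1):
                result[j1 + k] = (true_words[j1 + k], ocr_words[i1 + k][1])
    return result


if __name__ == "__main__":
    ocr = [("Tbe", (0, 0, 10, 5)), ("quick", (12, 0, 30, 5)),
           ("brovvn", (32, 0, 50, 5)), ("fox", (52, 0, 60, 5))]
    for word, box in transfer_boxes(ocr, ["The", "quick", "brown", "fox"]):
        print(word, box)
```

Words that the OCR pass missed entirely, or stretches where the word counts differ, come back with no box, so a real tool would still need some way to place them (for example by interpolating between neighbouring boxes).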