Quote:
Originally Posted by JawadLeLogeur
It's unclear if it can actually search the text inside the PDFs
It's not trivial to do it well...
Doing it "the bulldozer way" is of course easier.
The problem is: if you have on the order of a thousand document files, each with on the order of a million characters of text, how long do you have to wait, and how heavy are the other inefficiencies, when you search through this roughly 1 GB of data locked inside a format that isn't search-friendly?
The question raised is good, relevant and very interesting, if it means "which tools do we have to index a collection, Google-style?".
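To make the difference from the bulldozer way concrete, here is a minimal sketch of an inverted index in Python. It assumes the text has already been extracted from the PDFs (that step is not shown), and the names tokenize, build_index and search are just made up for illustration, not taken from any particular app.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample data standing in for text extracted from two PDFs.
docs = {
    "manual.pdf": "How to index a collection of documents",
    "notes.pdf": "Searching a collection the bulldozer way",
}
index = build_index(docs)
hits = search(index, "collection index")
```

The expensive pass over all the text happens once, when the index is built. After that, each search is a few set lookups instead of a scan through every file, which is the whole point when the collection is in the GB range.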
EDIT: the Andro Search found by desk7 seems to go in the right direction