The idea is not bad. The only catch is how to reliably identify the same three words in every format of the book. Would it be fair to pick them from a small dictionary of words that we agree on? I think this would be an interesting project, and it would be fun to see how large and accurate a database we could build.
It would also save people potentially thousands of man-hours of organization if we could get this into wide use.
The key is that even if we don't get a one-to-one match for all editions, it should not matter: eventually you will have fingerprints for the other editions, and those would be tagged as well.
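To make that concrete, here is a rough Python sketch of how it might work. Everything in it is my own assumption, not a settled design: the word list in `DICTIONARY` is made up, the rule "first three distinct dictionary words in order of appearance" is just one possible way to pick the three words, and `fingerprint` and `FingerprintDB` are hypothetical names.

```python
import re

# Hypothetical small dictionary; the real word list is still an open question.
DICTIONARY = {"whale", "ship", "sea", "captain", "harpoon"}


def fingerprint(text, dictionary=DICTIONARY, size=3):
    """Return the first `size` distinct dictionary words, in order of appearance.

    Lowercasing and keeping only runs of letters is an assumed normalization,
    meant to make plain-text and marked-up copies of a book agree.
    """
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in dictionary and word not in found:
            found.append(word)
            if len(found) == size:
                break
    return tuple(found)


class FingerprintDB:
    """Maps fingerprints to a title; several fingerprints may share one title."""

    def __init__(self):
        self._titles = {}

    def tag(self, text, title):
        # Record this copy's fingerprint under the given title.
        self._titles[fingerprint(text)] = title

    def lookup(self, text):
        # Return the title for this copy, or None if its fingerprint is unknown.
        return self._titles.get(fingerprint(text))
```

With this, a plain-text copy and an HTML copy of the same edition land on the same fingerprint, while an edition with an extra preface gets a different one. That second edition simply gets tagged with the same title once someone submits it, so no one-to-one match is needed.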
I am still looking for volunteers.