Quote:
Originally Posted by obvious
Would it be feasible to add an option to find duplicates by content? CPU/IO heavy no doubt but what do you think?
Just a thought 
I've been trying to use the "count words" plugin instead, in order to identify different formats with the same content.
If the content is the same, the word count should be the same, right? Err... wrong.

My results with that plugin are inconclusive for the moment: books with supposedly the same content (e.g. downloaded from the same source, such as Gutenberg, in different formats) actually had very different word counts. Still testing to see what's wrong.
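
For what it's worth, here is a rough sketch of what a "same content" check could look like if you compared normalized text instead of raw word counts. This is not how the "count words" plugin works, as far as I know. The function names (`normalize`, `fingerprint`) are made up for illustration, and it assumes you already have the plain text extracted from each format. Format-specific extras such as license headers or a table of contents may still throw it off.

```python
import hashlib
import re
import unicodedata


def normalize(text):
    """Reduce text to case-folded words separated by single spaces.

    Hypothetical helper: drops punctuation, line breaks and repeated
    whitespace, which often differ between formats of the same book.
    """
    text = unicodedata.normalize("NFKC", text).casefold()
    words = re.findall(r"\w+", text)
    return " ".join(words)


def fingerprint(text):
    """Return (word_count, sha1_hex) for the normalized text."""
    norm = normalize(text)
    digest = hashlib.sha1(norm.encode("utf-8")).hexdigest()
    word_count = len(norm.split())
    return word_count, digest


# Two renderings of the same sentence with different spacing,
# capitalization and apostrophe style.
a = "The  Quick brown fox.\nDon't stop!"
b = "the quick BROWN fox. Don\u2019t stop!"

print(fingerprint(a) == fingerprint(b))
```

Two books whose fingerprints match almost certainly have the same text. A mismatch only says that something differs, not what, so a fuzzier comparison would be needed to catch near-duplicates.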