Quote:
Originally Posted by chaley
false positives ... removing edges between nodes that are known not to be duplicates.
It's worth considering how a duplicate finder is likely to be used. Will it be used only to find and permanently merge or eliminate duplicates? Or will it also be used as some sort of pseudo search extension?
If the duplicate search includes soundex functionality (fuzzy matching of similar-sounding names) that isn't available in the search bar, we may want either to be able to disable the false positive removal, or to implement the duplicate-finding functions in the search bar as well.
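For anyone not familiar with soundex, here is a minimal Python sketch of the classic American Soundex code. This is only an illustration of the kind of fuzzy matching I mean; I'm not assuming the duplicate finder uses this exact algorithm, and the function name is my own.

```python
def soundex(name):
    """Return the 4-character American Soundex code for a name (illustrative sketch)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for letter in letters:
            codes[letter] = digit

    # Keep letters only, so punctuation and spaces are ignored.
    name = "".join(ch for ch in name.lower() if ch.isalpha())
    if not name:
        return ""

    digits = []
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        if ch in "hw":
            # h and w do not separate letters that share a code.
            continue
        code = codes.get(ch, "")
        if code and code != prev:
            digits.append(code)
        # Vowels reset prev, so a repeated code after a vowel is kept.
        prev = code

    # Pad with zeros and truncate to one letter plus three digits.
    return (name[0].upper() + "".join(digits) + "000")[:4]
```

Two names that get the same code, such as "Robert" and "Rupert", would be treated as similar sounding, which is why this sort of match is interesting as a search feature and not only for merging.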
I know that at some point I'm going to find a group of near-duplicates that I don't want to merge and do want to exclude from further duplicate searches, but that I'll later want to pull up again as a group, simply because I remember finding it once before.
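To make the idea concrete, here is a rough Python sketch of what a switchable false positive removal could look like. It assumes candidate duplicates are stored as pairs of book ids (edges) and that the "known not to be duplicates" list is a set of such pairs. The function and parameter names are hypothetical, and this is not meant to describe how the duplicate finder is actually implemented.

```python
from collections import defaultdict


def duplicate_groups(candidate_pairs, exempt_pairs=(), apply_exemptions=True):
    """Group ids into connected components of the candidate-duplicate graph.

    exempt_pairs holds pairs marked "not duplicates"; their edges are dropped
    unless apply_exemptions is False (hypothetical switch, for search-style use).
    """
    exempt = {frozenset(p) for p in exempt_pairs} if apply_exemptions else set()

    graph = defaultdict(set)
    for a, b in candidate_pairs:
        if frozenset((a, b)) in exempt:
            continue  # edge removed: this pair is a known false positive
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk to collect everything connected to start.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) > 1:
            groups.append(sorted(component))
    return groups


candidates = [(1, 2), (2, 3), (1, 3), (4, 5)]
not_duplicates = [(1, 2), (2, 3), (1, 3)]

# Normal duplicate search: the exempted group no longer shows up.
print(duplicate_groups(candidates, not_duplicates))
# Exemptions switched off: the old group can be found again.
print(duplicate_groups(candidates, not_duplicates, apply_exemptions=False))
```

With the switch on, books 1, 2 and 3 stay out of the results; with it off, the same group comes back without anyone having to delete the exemption list.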