Quote:
Originally Posted by kiwidude
I ended up writing my own external tool to query the database and find duplicates based on quite a range of criteria. It is fairly "fuzzy" in that it strips off stuff like a leading "A " and "The ", rips out characters like colons, apostrophes, etc., and pumps out various sets of results.
|
This is built into find_identical_books.
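For anyone following along, here is a rough sketch of the kind of fuzzy title matching described in the quote. To be clear, this is not the code inside find_identical_books; the names fuzzy_title and group_by_fuzzy_title are made up for illustration, and the exact rules (which articles, which characters) are my assumptions.

```python
import re
from collections import defaultdict

# Assumption: only the leading articles mentioned in the quote are stripped.
LEADING_ARTICLES = ("a ", "the ")

def fuzzy_title(title):
    """Reduce a title to a loose comparison key (hypothetical helper)."""
    key = title.lower()
    key = re.sub(r"[^\w\s]", "", key)         # drop colons, apostrophes, etc.
    key = re.sub(r"\s+", " ", key).strip()    # collapse runs of whitespace
    for article in LEADING_ARTICLES:
        if key.startswith(article):
            key = key[len(article):]
            break
    return key

def group_by_fuzzy_title(books):
    """books: iterable of (book_id, title). Returns only keys with 2+ books."""
    groups = defaultdict(list)
    for book_id, title in books:
        groups[fuzzy_title(title)].append(book_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Feeding it (id, title) pairs pulled from the library gives back one list of ids per suspected duplicate title.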
Quote:
It also does "starts with" type checks, as the same book can turn up with a longer version of its title, which many books have.
|
This is a good point. I've seen a few of this type of dupe that weren't caught with my tools.
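A "starts with" check along those lines could look something like the sketch below. It assumes the titles have already been lowercased and stripped of punctuation, and prefix_candidates is a name I made up. It compares every pair, so it is only a sketch for small to medium libraries.

```python
from itertools import combinations

def prefix_candidates(titles):
    """titles: dict of book_id -> already-normalised title (assumed input).

    Returns (id_a, id_b) pairs where one title is a whole-word prefix
    of the other, e.g. "hobbit" and "hobbit there and back again".
    """
    pairs = []
    for (id_a, a), (id_b, b) in combinations(sorted(titles.items()), 2):
        short, long_ = sorted((a, b), key=len)
        # Require a space after the prefix so "dune" does not match "dunes".
        if short and long_.startswith(short + " "):
            pairs.append((id_a, id_b))
    return pairs
```

Note that this will also flag things like a book and its sequel when the sequel's title begins with the first book's title, so it produces candidates to review rather than confirmed dupes.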
Quote:
you can get a problem with wasting your time re-verifying that exception every time you run the duplicate check, particularly when you have lots of books.
|
Like you, I built my own dupe checker, and like you, I found myself rechecking the same exceptions a lot. One of the reasons I posted was to highlight the same issue you have: what you want or need the dupe checker to do seems to change as you use it. I found myself changing the search a lot to look for dupes in different ways, and spending too much time looking at the exceptions. For a while I had a custom boolean column that meant "If every dupe found for this title has this column checked, these books are not dupes of each other."
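In case it helps, the boolean column idea boils down to roughly this. It is a sketch from memory rather than the tool I actually ran; drop_reviewed_groups and the dict of flags are stand-ins for however you read the custom column.

```python
def drop_reviewed_groups(groups, not_dupe_flag):
    """Hide duplicate groups that have already been reviewed.

    groups: list of lists of book ids reported as possible dupes.
    not_dupe_flag: dict of book_id -> bool, standing in for the custom
    boolean column (a missing id counts as unchecked).
    """
    return [
        group for group in groups
        if not all(not_dupe_flag.get(book_id, False) for book_id in group)
    ]
```

A group only disappears when every book in it is checked, so if a new, unchecked copy later joins a reviewed group, the whole group shows up again for another look.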