12-10-2010, 10:45 AM   #9
Starson17
Wizard
Quote:
Originally Posted by kiwidude
I ended up writing my own external tool to query the database and find duplicates based on quite a range of criteria. It is fairly "fuzzy" in that it strips off stuff like lead "A " and "The ", rips out characters like colons, apostrophes etc and pumps out various sets of results.
This is built into find_identical_books.
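
For anyone following along, here is a rough sketch of the kind of fuzzy title matching described in the quote. This is not the code from find_identical_books; the names normalize_title and group_duplicates are made up for illustration, and the list of leading articles is an assumption.

```python
import re

# Assumed list of leading articles to ignore; extend as needed.
LEADING_ARTICLES = ("a ", "an ", "the ")


def normalize_title(title):
    """Return a simplified form of a title for fuzzy comparison (hypothetical helper)."""
    t = title.lower().strip()
    # Drop punctuation such as colons and apostrophes.
    t = re.sub(r"[^\w\s]", "", t)
    # Collapse runs of whitespace left behind by the removal.
    t = re.sub(r"\s+", " ", t).strip()
    # Strip at most one leading article.
    for article in LEADING_ARTICLES:
        if t.startswith(article):
            t = t[len(article):]
            break
    return t


def group_duplicates(titles):
    """Group titles that share a normalized form; return only groups of 2 or more."""
    groups = {}
    for title in titles:
        groups.setdefault(normalize_title(title), []).append(title)
    return [g for g in groups.values() if len(g) > 1]
```

Calling group_duplicates on a list of titles gives back only the clusters that collapse to the same simplified string, so "The Stand" and "Stand" would land in one group while an unrelated title is left out.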

Quote:
It also does "starts with" type checks, as there could be the same book but a longer version of the title as many books often have.
This is a good point. I've seen a few dupes of this type that my tools didn't catch.
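
A "starts with" check could look something like the sketch below. It is only an illustration, not kiwidude's tool or anything in calibre; simplify and is_prefix_match are hypothetical names, and the rule that the shorter title must end on a word boundary is my own assumption to avoid matching "Dune" against "Dunes of Mars".

```python
def simplify(title):
    """Lowercase a title and collapse whitespace (hypothetical helper)."""
    return " ".join(title.lower().split())


def is_prefix_match(a, b):
    """Return True if one title equals the other or is a word-boundary prefix of it."""
    a, b = simplify(a), simplify(b)
    if not a or not b:
        return False
    short, long_ = sorted((a, b), key=len)
    if short == long_:
        return True
    # The character right after the prefix must not continue a word.
    return long_.startswith(short) and not long_[len(short)].isalnum()
```

This catches a short title against its longer form with a subtitle, at the cost of more false positives for very short titles, which is where an exception list becomes useful.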

Quote:
you can get a problem with wasting your time re-verifying that exception every time you run the duplicate check, particularly when you have lots of books.
Like you, I built my own dupe checker, and like you, I found myself rechecking the same exceptions a lot. One of the reasons I posted was to highlight the issue you raised: what you want or need the dupe checker to do seems to change as you use it. I kept changing the search to look for dupes in different ways, and I spent too much time looking at the exceptions. For a while I had a custom boolean column that meant "If all dupes found for this title have this column checked, we are not dupes of each other."
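
The boolean-column rule can be sketched roughly as follows. This is a simplified illustration rather than the actual checker: unresolved_groups is a hypothetical name, groups are assumed to be lists of book ids, and the custom column is assumed to have been read into a plain dict of id to True/False.

```python
def unresolved_groups(groups, not_dupe_flag):
    """Return the duplicate groups that still need a manual look.

    groups: list of lists of book ids found by a dupe search.
    not_dupe_flag: dict mapping book id to the custom boolean column value
    (missing ids are treated as unchecked).
    """
    return [
        group
        for group in groups
        # Skip a group only when every member has the column checked.
        if not all(not_dupe_flag.get(book_id, False) for book_id in group)
    ]
```

Because the rule requires every member to be checked, a newly added book that matches an already-cleared pair brings the whole group back for review, which is usually what you want.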