Old 07-15-2011, 07:58 PM   #108
saintly
@kiwidude: I love your plugin! Before another user tipped me off to it, I was using a lot of Perl scripts to manage my collection. It looks like you were way ahead of my efforts, and your plugin found hundreds of duplicates I missed.

If I may offer a suggestion:
I previously had lots of books with the series name in the title ("Doctor Who: Something or Other" / "Star Trek: Something"). To detect duplicates, I used this technique:
- Fuzzy author match (same as yours: last name + first initial)
- Split up the title on these characters: "-:;,&" and the word "and"
- Flag a possible match if any of those pieces matches a piece of another book's title
- Allow a piece to be 'whitelisted', so that it won't trip on 'Doctor Who' all the time

That allows me to detect "Doctor Who: Something or Other" and the book "Something or Other" by the same author as possible duplicates. Additionally, it can detect combos like:
"Nightfall's Sequel"
"Nightfall; Nightfall's Sequel; The third Nightfall Book" (an e-book that includes the text of 3 other books, a somewhat rare occurrence)
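
The technique described in the post could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the original Perl scripts and not the plugin's code: the names `author_key`, `title_pieces`, and `find_possible_duplicates` are hypothetical, authors are assumed to be written "First Last", and books are assumed to be plain (title, author) pairs.

```python
import re
from collections import defaultdict

# Split on the characters - : ; , & and on the standalone word "and".
SPLIT_RE = re.compile(r"[-:;,&]|\band\b", re.IGNORECASE)


def author_key(author):
    """Fuzzy author key: last name + first initial (assumes 'First Last')."""
    parts = author.lower().replace(".", " ").split()
    if not parts:
        return ""
    if len(parts) == 1:
        return parts[0]
    return f"{parts[-1]} {parts[0][0]}"


def title_pieces(title, whitelist=frozenset()):
    """Return the normalized title pieces, minus any whitelisted ones."""
    ignored = {" ".join(w.lower().split()) for w in whitelist}
    pieces = set()
    for piece in SPLIT_RE.split(title):
        piece = " ".join(piece.lower().split())  # lowercase, collapse spaces
        if piece and piece not in ignored:
            pieces.add(piece)
    return pieces


def find_possible_duplicates(books, whitelist=frozenset()):
    """books is a list of (title, author) pairs (hypothetical input shape).

    Returns sorted index pairs (i, j), i < j, for books that share a fuzzy
    author key and at least one non-whitelisted title piece.
    """
    seen = defaultdict(list)  # (author key, piece) -> indices of earlier books
    matches = set()
    for i, (title, author) in enumerate(books):
        key = author_key(author)
        for piece in title_pieces(title, whitelist):
            for j in seen[(key, piece)]:
                matches.add((j, i))
            seen[(key, piece)].append(i)
    return sorted(matches)


books = [
    ("Doctor Who: Something or Other", "Jane Smith"),
    ("Something or Other", "J. Smith"),
    ("Doctor Who: Another Tale", "Jane Smith"),
    ("Nightfall's Sequel", "Isaac Asimov"),
    ("Nightfall; Nightfall's Sequel; The third Nightfall Book", "I. Asimov"),
]
print(find_possible_duplicates(books, whitelist={"Doctor Who"}))
```

Without the whitelist, the two "Doctor Who" titles would also be flagged against each other, which is the noise the whitelist is meant to suppress. Note that splitting on "-" also breaks hyphenated titles into pieces, so results are candidates for review rather than confirmed duplicates.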