Old 04-23-2011, 01:05 PM   #126
kiwidude
Calibre Plugins Developer
Wow, a simple bit of magic like that for soundex? Very cool, thanks. I guess I could use the same approach as "similar title" as the starting point (stripping subtitles, punctuation, etc.) and then apply soundex to the result.
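
To make that idea concrete, here is a rough sketch of "normalise first, then soundex". It is only an illustration under assumptions: the helper names `similar_title` and `title_soundex`, the subtitle separators, the article stripping and the code length of 6 are all made up for the example and are not what the plugin actually does. The `soundex` function follows the classic American Soundex rules.

```python
import re

# Classic American Soundex digit groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text, length=4):
    """Soundex code of the ASCII letters in text, padded/truncated to length."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit:
            # Adjacent letters with the same digit are coded only once.
            if digit != prev:
                code += digit
            prev = digit
        elif ch not in "HW":
            # Vowels (and Y) break a run of equal digits; H and W do not.
            prev = ""
    return (code + "0" * length)[:length]


def similar_title(title):
    """Hypothetical normaliser: drop subtitle, punctuation, leading article."""
    title = re.split(r":| - ", title, maxsplit=1)[0]
    title = re.sub(r"[^\w\s]", "", title.lower())
    title = re.sub(r"^(the|a|an)\s+", "", title.strip())
    return " ".join(title.split())


def title_soundex(title, length=6):
    """Hypothetical: soundex applied to the normalised title as one string."""
    return soundex(similar_title(title), length)
```

With this sketch, a misspelt "The Hobit" and "The Hobbit: There and Back Again" end up with the same key, because the subtitle and article are removed before the phonetic coding is applied.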

The question once again becomes the permutations... currently we have this:
1. Matching ISBN only
2. Identical title, ignore author
3. Similar title, ignore author
4. Similar title, identical author
5. Similar title, similar author*
6. Ignore title, similar author*
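
One way to keep the permutations above manageable is to treat each search as a pair of key functions, one for the title and one for the author, where "ignore" simply means no key function. This is a sketch only: `find_duplicates`, `identical` and `similar` are hypothetical names, `similar` is a trivial stand-in for the real matching, and ISBN matching (option 1) is left out.

```python
from collections import defaultdict


def identical(text):
    """Key for an exact match."""
    return text


def similar(text):
    """Stand-in for a 'similar' match: case and whitespace insensitive."""
    return " ".join(text.lower().split())


def find_duplicates(books, title_key=None, author_key=None):
    """Group (book_id, title, author) tuples sharing the same composite key.

    Passing None for a key function means that field is ignored.
    """
    groups = defaultdict(list)
    for book_id, title, author in books:
        key = (
            title_key(title) if title_key else None,
            author_key(author) if author_key else None,
        )
        groups[key].append(book_id)
    # Only groups with more than one book are duplicate candidates.
    return [ids for ids in groups.values() if len(ids) > 1]


books = [
    (1, "Dune", "Frank Herbert"),
    (2, "dune ", "Frank Herbert"),
    (3, "Dune", "F. Herbert"),
    (4, "Emma", "Jane Austen"),
]

option_2 = find_duplicates(books, title_key=identical)
option_3 = find_duplicates(books, title_key=similar)
option_4 = find_duplicates(books, title_key=similar, author_key=identical)
```

Adding a new permutation then only means writing one more key function and choosing which pairs to expose.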

For 5 & 6, as mentioned previously, "similar author" is going to become more conservative so that it no longer ignores initials. We will add at least one fuzzier author option (for example, one that looks at the surname plus first initial only):
7. Ignore title, fuzzy author
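
As a rough illustration of the "surname plus first initial" idea, the sketch below reduces an author name to a short key. The function name `fuzzy_author` is hypothetical, and it assumes names arrive either as "First Middle Last" or as "Last, First"; multi-word surnames without a comma and suffixes such as "Jr." are not handled.

```python
import re


def fuzzy_author(author):
    """Hypothetical fuzzy key: first initial plus surname, lower-cased."""
    # Replace punctuation (except the comma) with spaces so "J.R.R." splits.
    cleaned = re.sub(r"[^\w\s,]", " ", author.lower())
    if "," in cleaned:
        # Assumed "Last, First Middle" form.
        surname_part, _, given_part = cleaned.partition(",")
        surname_tokens = surname_part.split()
        given_tokens = given_part.split()
    else:
        # Assumed "First Middle Last" form: last token is the surname.
        tokens = cleaned.split()
        surname_tokens = tokens[-1:]
        given_tokens = tokens[:-1]
    if not surname_tokens:
        return ""
    surname = "".join(surname_tokens)
    initial = given_tokens[0][0] if given_tokens else ""
    return (initial + " " + surname).strip()
```

Under these assumptions "J.R.R. Tolkien", "John Ronald Reuel Tolkien" and "Tolkien, J. R. R." all map to the same key, while a bare "Tolkien" stays distinct because it has no initial.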

Now we have soundex. Does it make sense to apply it only to titles and not to author names? Presumably author names would run into the same problem, with initials etc. distorting the results. So maybe we add:
8. Soundex title, similar author

How does that sound?