04-28-2011, 12:55 PM   #181
kiwidude
Calibre Plugins Developer
Ok, today's pop quiz question - who can offer me an efficient file comparison algorithm?

I've tried a first pass that finds books with the same file size, followed by a second pass that compares their SHA-256 hashes. However, this has two problems: (a) it is still pretty darn slow for large libraries (around 4.5 minutes to scan a 40,000-book library with a fair few formats), and (b) after all that it still isn't "accurate" enough, returning a bunch of duplicates which really aren't duplicates; they just "hash" together.
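
For anyone wanting to experiment, the two-pass idea described above can be sketched roughly like this. It is only a minimal sketch, not the plugin's actual code: `sha256_of` and `find_duplicates` are hypothetical names, and it assumes you already have a plain list of file paths for the book formats.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large book is never loaded whole (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose contents share a size and a SHA-256 digest."""
    # Pass 1: group by size; a file with a unique size cannot have a duplicate.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with at least one other file.
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates(list_of_paths)` gives back one list per set of matching files. Since an accidental SHA-256 collision between two different files is practically impossible, any group this sketch returns should be byte-for-byte identical files.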

Suggestions on a postcard, please.