Quote:
Originally Posted by kiwidude
(3) Find Duplicates - this is a real problem for the binary comparison. Copying files just so you can compute a hash on them? Performance would go to the toilet - remember this is comparing every single format of every book in your library, and it makes use of the file timestamps to decide whether to recompare. Welcome any suggestions on this.
I suggest that computing the hash be added to the API; there is no need to copy the file. Since Find Duplicates is eventually to go into the calibre core, this may be the right approach.
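To make that concrete, here is a minimal sketch of what such an API call could look like. The name format_hash() and its parameters are hypothetical, not part of calibre's current API; the point is only that the hash can be computed by streaming the format file in chunks, with no copy made:

[code]
import hashlib

def format_hash(path, algorithm='sha256', chunk_size=1 << 20):
    """Hash the format file at `path` without loading it all into memory."""
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # Read in 1 MB chunks so large formats never sit fully in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()
[/code]

The caller would then compare hexdigests instead of whole files, and the timestamp check could still gate whether a rehash is needed at all.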
Another suggestion: return a pipe instead of a temp file. The file must be read anyway, so reading a pipe shouldn't be any more costly than reading the file. The DB2 equivalent would need to handle filling the pipe without blocking, but there are several ways to do that; the easiest is to use a thread (see the sketch after point 4 below).
(4) The pipe solution would work here as well.
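A minimal sketch of the pipe-plus-thread idea, assuming the database side exposes some way to write a format's bytes to a file-like object (write_format_to() below is a stand-in for that, not an existing calibre call). A background thread fills the write end, so the caller can read the other end as an ordinary file:

[code]
import os
import threading

def open_format_pipe(write_format_to):
    """Return a readable file object fed by `write_format_to(fobj)`."""
    read_fd, write_fd = os.pipe()

    def feeder():
        # The DB-side code writes the format bytes into the pipe; closing
        # the write end when done signals EOF to the reader.
        with os.fdopen(write_fd, 'wb') as sink:
            write_format_to(sink)

    t = threading.Thread(target=feeder)
    t.daemon = True
    t.start()
    return os.fdopen(read_fd, 'rb')
[/code]

The consumer then just reads (or hashes) the returned file object; nothing is ever written to disk, and neither side blocks the other because the reader drains the pipe as the feeder thread fills it.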