kiwidude: You can use a tuple directly as a dict key; you don't have to convert it to a string.
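A minimal sketch of what I mean, assuming your key is something like (size, hash). The file names, sizes and digests here are made up for illustration; they are not from your code:

```python
# Hypothetical sample data: (path, size_in_bytes, hex_digest)
files = [
    ("a.txt", 12, "abc"),
    ("b.txt", 12, "abc"),
    ("c.txt", 7, "def"),
]

groups = {}
for path, size, digest in files:
    key = (size, digest)  # the tuple itself is the dict key, no str() needed
    groups.setdefault(key, []).append(path)

# Keep only the keys that more than one file maps to
duplicates = {k: v for k, v in groups.items() if len(v) > 1}
```

This works as long as everything inside the tuple is itself hashable (ints, strings, bytes, other tuples).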
Try verifying that the hashes are actually the same for your different files with a different hashing tool, just to ensure there isn't a bug with the hashlib library (although I find that rather unlikely).
If they are indeed hash collisions, you can add a final check that compares the reported duplicates byte by byte. Since there are very few of these, it shouldn't be slow.
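Something along these lines could serve as that final check. It is only a sketch: the function name files_identical and the chunk size are my own choices, and it assumes both paths are regular files on disk:

```python
import os

def files_identical(path_a, path_b, chunk_size=1 << 16):
    """Return True if both files have exactly the same bytes."""
    # Different sizes can never match, so skip reading in that case
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # first differing chunk, stop early
            if not chunk_a:
                return True   # both files reached EOF with no difference
```

The standard library's filecmp.cmp(path_a, path_b, shallow=False) does a similar content comparison if you would rather not write the loop yourself.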