MobileRead Forums - View Single Post

jonzim · 08-29-2021, 10:27 AM

This looks very close to what I need. I wrote a separate app that I run on my machine: it pulls all the metadata from Calibre and indexes it in Lucene so I can quickly search before importing a new book. When I select a file in my own UI, it automatically searches on various parameters for me.

However, the original filenames sometimes differ when importing books: an identical book might have already been imported under a different filename. I'm currently also checking the file size, but a matching size doesn't guarantee that the files are identical.

If it is compatible with your intent, could you consider adding another custom column for the MD5 checksum of the original file too? Only the checksum at the time of import matters, not the current checksum. I think it would be rather trivial, maybe something like this? (I haven't tried it)

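import hashlib
...
# Read the original file's bytes at import time (file must be opened in binary mode)
filestream = file.read()
h = hashlib.md5()
h.update(filestream)
# mi is assumed to be the book's metadata object; md5 would be the new field
mi.md5 = h.hexdigest()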
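
To make the idea a bit more concrete, here is a minimal standalone sketch of the duplicate check, independent of Calibre. The names `file_md5`, `find_duplicate`, and the `known_checksums` dict are hypothetical and only for illustration; how the checksum would actually be stored in a custom column is not shown. It reads the file in chunks so large books don't need to fit in memory at once.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate(path, known_checksums):
    """Look up a file's checksum among already-imported books.

    known_checksums is a hypothetical dict mapping hex digest -> book id.
    Returns the matching book id, or None if the file looks new.
    """
    return known_checksums.get(file_md5(path))
```

Two files with the same bytes give the same digest regardless of filename, which is the property a filename or file size check alone can't provide.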
