Old 09-10-2010, 02:59 PM   #1
BookGnome
Voracious Reader
Posts: 4
Karma: 62
Join Date: Sep 2010
Device: Kindle
Calibre and bit-rot

I was thinking about something today, and as far as I can tell from looking at the database structure, Calibre has no protection against bit-rot of the e-books themselves. I'm not talking about database corruption; I'm talking about filesystems with bad sectors, where books might get silently corrupted on disk.

All Calibre seems to track is the uncompressed size of the book. Size alone isn't really much of a guarantee of file integrity.

It seems to me that if it's not doing so already, Calibre ought to store a hash of the book in the database to validate that the book hasn't been corrupted on disk. An SHA-1 or MD5 hash would probably be sufficient for the purpose.
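To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not how Calibre works; the function names (`file_digest`, `build_manifest`, `find_damaged`) are made up for illustration, and the manifest is an ordinary dictionary rather than anything stored in Calibre's database. It only assumes Python's standard `hashlib` and `pathlib` modules.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large books are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(library_dir, algorithm="sha1"):
    """Map each file's path (relative to library_dir) to its digest."""
    root = Path(library_dir)
    return {
        str(p.relative_to(root)): file_digest(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(library_dir, manifest, algorithm="sha1"):
    """Return relative paths that are missing or whose digest changed."""
    root = Path(library_dir)
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p, algorithm) != expected:
            damaged.append(rel)
    return damaged
```

The manifest would be built once while the files are known to be good and re-checked periodically. A check like this can only detect damage, and it can't tell bit-rot apart from a deliberate edit to the file, so the stored hash would need refreshing whenever a book is legitimately modified.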

In addition, it might be wise to store some recovery bits on the filesystem (e.g. par2 files, or some other variant of Reed-Solomon encoding) in order to be able to recover from modest amounts of on-disk corruption.

Integrity of the database itself is important, but I've seen enough disks get flaky, and bought enough e-books from vendors that don't allow re-downloads, that protection from bit-rot on the filesystem matters to me.

I thought I'd post about it here and see what others thought before posting wish-list items on the bug tracker. After all, maybe Calibre is already doing something smart about this, and I just don't know about it.

Comments? Suggestions? Additional thoughts?