Old 08-21-2009, 09:42 PM   #87
dvs0826
Quote:
Originally Posted by kovidgoyal
@dvs0826

1) calibre's folder structure is not flat. Have you actually used calibre?
You're right, it's not flat. It turns out all of my "ebooks" are actually papers from academic journals, which almost always have different authors, so for my purposes it might as well be flat. I withdraw the statement that "it uses a flat directory structure" and will just say that it is effectively flat for me. If I'm still wrong, please advise.

Quote:
2) Storing links in a database is extremely non-robust. Already with even the current scheme I get endless bug reports from people that mess with the file names in the folders and then calibre loses track of the files. I can only imagine how many more such bug reports I'd have to deal with if I let people keep their calibre libraries distributed all over the place. And then there's the problem of different file systems having different file name and case conventions on different OSes, in different phases of the moon.
I don't think storing links in a database is as bad as you make it sound. It's very easy to check, when loading a database, whether a file exists. Or, so that larger databases still load quickly, defer the check until the file is actually opened. Display a message box: "The file <path> could not be located. It is possible the file was moved via an external mechanism. Would you like to [delete] or [re-locate] this document?" Clicking delete removes all of that document's entries from the database. Clicking re-locate opens a file browser dialog, and once a file is selected it automatically inherits all the previously stored metadata for free. Very simple.
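To make the idea concrete, here is a rough sketch of the deferred check in Python with sqlite3. The `documents` table, its columns, and the `ask_user` callback (which stands in for the message box) are all made up for illustration; none of this is calibre's actual schema or code.

```python
import os
import sqlite3

def open_document(conn, doc_id, ask_user):
    """Return a usable path for doc_id, or None if the entry is gone.

    ask_user(path) is a hypothetical stand-in for the message box. It
    returns ("delete", None) or ("relocate", new_path).
    """
    row = conn.execute(
        "SELECT path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        return None
    path = row[0]
    # The existence check happens only now, when the file is opened.
    if os.path.exists(path):
        return path
    choice, new_path = ask_user(path)
    if choice == "relocate":
        # Only the path column changes, so the stored metadata stays attached.
        conn.execute(
            "UPDATE documents SET path = ? WHERE id = ?", (new_path, doc_id)
        )
        conn.commit()
        return new_path
    # "delete": drop the entry for the missing file.
    conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
    conn.commit()
    return None
```

Because re-locating only rewrites the path, everything else stored for that document carries over untouched.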

As a side benefit of this approach, all of calibre's performance problems would be solved. It's better under 0.6.x, but it's still pretty slow. Earlier it took me 1 hour to import a database of about 2,000 academic papers (about 9GB). It would have taken a few minutes tops using the database method I proposed. calibre is also currently occupying 600MB of memory on my computer, no doubt due to its database format. This would be solved as well, since it would only be necessary to keep in memory the items currently displayed on the screen, plus perhaps an additional page or two for caching and fast scrolling (I'm sure this is solvable under your current architecture too, but you get it for free with a database).
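What I mean by only keeping the visible items in memory is roughly the following, again as a sketch against a made-up `documents` table rather than anything calibre actually does. The page size of 50 is an arbitrary assumption.

```python
import sqlite3

def fetch_page(conn, page, page_size=50):
    """Load only the rows for one screenful of the book list.

    page is zero-based. The caller can keep the current page plus a
    neighbour or two cached for scrolling and drop the rest.
    """
    offset = page * page_size
    return conn.execute(
        "SELECT id, title FROM documents ORDER BY title, id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
```

With 2,000 papers that means holding 50 to 150 rows at a time instead of the whole list. For very large tables a keyset query (remembering the last title seen) would likely scale better than a growing OFFSET, but the principle is the same.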

Different filesystems having different case conventions is hardly a problem. Every operating system provides a way to retrieve a filename in the same case in which it is stored, so the database will simply store filenames in the correct case. The database will not be modified directly by the user, only by calibre. If the user does modify the database by hand, that obviously cannot be supported. The only thing you will have to deal with is the slash character. Luckily / is an invalid filename character on all operating systems I'm aware of, so you use that as the separator.
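For the separator part, a minimal sketch using Python's pathlib: store every path with / and convert back when handing it to the OS. The function names and the `windows` flag are my own invention for illustration, and this deliberately does not touch case, which is stored exactly as given.

```python
from pathlib import PurePosixPath, PureWindowsPath

def to_stored(native_path, windows):
    """Convert a native path to the /-separated form kept in the database."""
    pure = PureWindowsPath(native_path) if windows else PurePosixPath(native_path)
    return pure.as_posix()

def from_stored(stored, windows):
    """Convert a stored /-separated path back to the native form."""
    return str(PureWindowsPath(stored)) if windows else stored
```

For example, `to_stored(r"papers\2009\Smith.pdf", windows=True)` gives `papers/2009/Smith.pdf`, and `from_stored` on a Windows machine turns it back into the backslash form.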


Quote:
4) Not having enough disk space. Are you serious? In this day and age of 1TB hard drives?
It's not that surprising. You will rarely find an office PC with a 1TB hard drive. At my company I don't even get development boxes with more than 200-300GB, and I'd be surprised if a non-technical office person got more than 100GB.

Last edited by dvs0826; 08-21-2009 at 09:47 PM.