Following gwynevans' pointer, I modified database2.py to put the metadata.db file on a local (non-network) drive, as follows:
Code:
self.dbpath = os.path.join('c:\\calibre\\db', 'metadata.db')
The location of the library itself (with all the books) and the directory I was importing from remained on a NAS drive connected via a local gigabit network. The NAS drive is RAID 5 (striping with parity; 4 physical drives) where I get 7 MB/sec sustained transfer rates. The local disk is not RAID, and is actually slower than the NAS in many scenarios (4.5 MB/sec using the same benchmark).
For both tests, I started with an empty library (no files, no metadata.db file). I used Calibre 0.5.14 on Windows XP. For the test, I imported a directory tree containing 753 ebooks (mostly .lit).
Before the change, it took 54 minutes to import the library.
After the change, it took 10 minutes to import the library.
In both cases, the vast majority of the time was spent adding the books to the database (after scanning the files and reading the metadata), so I strongly suspect SQLite performance over SMB is the culprit. I also noticed that the import time does not scale linearly with the number of books (the size of the database) when the metadata.db file is on the NAS. The first 20% or so (according to the progress bar) took about the same amount of time in both cases, but the NAS case was really crawling by the end. Yesterday, it took over 4 hours to import 1600 books with the database on the NAS (I'm not sure exactly how long it took because I went to bed), again starting with an empty database.
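For anyone who wants to check the SQLite-over-SMB suspicion without running a full import, here is a rough sketch of a stand-alone timing script. It is not Calibre code: the table, the file name bench_metadata.db and the function time_inserts are made up for illustration, and committing after every row is only an assumption about how the import behaves. Pass it one directory on the local disk and one on the NAS and compare the times.

```python
import os
import sqlite3
import sys
import tempfile
import time


def time_inserts(db_dir, n_rows=200):
    """Insert n_rows rows into a scratch database in db_dir, committing
    after each row, and return (elapsed_seconds, row_count)."""
    # Hypothetical scratch file; this does not touch a real metadata.db.
    db_path = os.path.join(db_dir, 'bench_metadata.db')
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            'CREATE TABLE IF NOT EXISTS books '
            '(id INTEGER PRIMARY KEY, title TEXT)')
        conn.commit()
        start = time.time()
        for i in range(n_rows):
            conn.execute('INSERT INTO books (title) VALUES (?)',
                         ('book %d' % i,))
            # Assumption: one commit per book, so every row is its own
            # transaction and has to be written out before continuing.
            conn.commit()
        elapsed = time.time() - start
        count = conn.execute('SELECT COUNT(*) FROM books').fetchone()[0]
    finally:
        conn.close()
    os.remove(db_path)  # clean up the scratch database
    return elapsed, count


if __name__ == '__main__':
    # Usage: python bench.py <local_dir> <nas_dir>
    for directory in sys.argv[1:] or [tempfile.gettempdir()]:
        elapsed, count = time_inserts(directory)
        print('%s: %d rows in %.2f s' % (directory, count, elapsed))
```

If the NAS directory comes out several times slower than the local one, that would be consistent with the per-transaction overhead over the network being the bottleneck rather than raw transfer speed.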
Kovid, are you open to adding an optional config parameter to specify an alternate location for metadata.db for NAS users like us?
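A minimal sketch of what such an option could look like, in place of the hard-coded path above. The name metadata_dir and the function resolve_dbpath are hypothetical and do not exist in Calibre; the fallback assumes the default is to keep metadata.db inside the library folder.

```python
import os


def resolve_dbpath(library_path, metadata_dir=None):
    """Return the full path to metadata.db.

    metadata_dir is a hypothetical optional setting. When it is unset or
    empty, metadata.db stays in the library folder (assumed default).
    """
    if metadata_dir:
        # User asked for the database to live somewhere else,
        # e.g. on a local drive.
        return os.path.join(metadata_dir, 'metadata.db')
    return os.path.join(library_path, 'metadata.db')
```

With this, resolve_dbpath(library_path, 'c:\\calibre\\db') would point at the same local file as my change above, while leaving the option unset would keep the existing behaviour.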
Thanks,
Todd Klaus