Quote:
Originally Posted by Tanjamuse
Is it possible to avoid the duplicate check when adding books to an empty library?
It only checks the title, not both title and author or any other columns.
Would it be possible to either override this duplicate check or make it compare more than one column when deciding whether a book is a duplicate?
I just tell calibredb to ignore the duplicate check via the command line (the --duplicates flag); I'm not sure how to do the same in the GUI.
I use the Find Duplicates plugin to check for duplicates after import. I've got plenty of fanfics whose author pseudonym or title has changed, so the URL identifier (e.g. https://archiveofourown.org/works/1234567) is the most reliable way of catching duplicates.
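To illustrate why the URL identifier works as a dedupe key: the numeric AO3 work id in the URL stays the same even when the author pseudonym or title changes. A minimal sketch (the helper names are mine, not calibre's or the plugin's):

```python
import re
from collections import defaultdict

# The numeric work id in an AO3 URL survives title/author renames.
AO3_WORK = re.compile(r"archiveofourown\.org/works/(\d+)")

def work_id(url):
    """Return the AO3 work id from a URL identifier, or None if it doesn't match."""
    m = AO3_WORK.search(url)
    return m.group(1) if m else None

def find_duplicates(books):
    """Group (title, url) pairs by work id; any group with >1 title is a duplicate."""
    groups = defaultdict(list)
    for title, url in books:
        wid = work_id(url)
        if wid:
            groups[wid].append(title)
    return {wid: titles for wid, titles in groups.items() if len(titles) > 1}

books = [
    ("Old Title", "https://archiveofourown.org/works/1234567"),
    ("New Title", "https://archiveofourown.org/works/1234567"),  # same work, renamed
    ("Other Fic", "https://archiveofourown.org/works/7654321"),
]
print(find_duplicates(books))  # the two renamed copies share work id 1234567
```

A title- or author-based comparison would miss the first pair entirely; the work id catches it.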
I was actually quite curious about performance so I ran some import tests using part of my AO3 fanfic library.
The SSD is a 500GB Samsung 840 with planar TLC NAND. It's 5-6 years old and 90% full, so probably quite slow by modern SSD standards, on top of normal performance degradation.
The HDD is a brand-new, empty 1TB 7200RPM Seagate Barracuda (found in my box of spare parts).
The flash drive is a 128GB Samsung BAR USB 3.1 (connected to a USB 3.0 port).
Code:
calibredb add --duplicates --recurse --library-path "X:\Calibre Portable\TestLibrary" "X:\ebooks\import"
Import Stats (source to destination)
               Time (mm:ss.00)   MB/min   books/min
SSD to SSD     13:54.10          178      355
SSD to HDD     16:55.52          147      291
HDD to HDD     20:19.20          122      243
Flash to HDD   17:37.79          141      280
Import Structure:
\Fandom\Authors\Authors - Title (id).ext
2.422 GB, 979 folders, 24,650 files
4,930 "unique" books (based on checksum)
* each book has epub, mobi, txt, opf & cover
2,411 unique titles (based on url identifier)
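For anyone checking my arithmetic: MB/min is the total import size (2.422 GB, treated as 1024-based) divided by elapsed minutes, and books/min is the 4,930 books divided the same way. A quick sketch:

```python
def rates(elapsed, size_gib=2.422, books=4930):
    """Compute (MB/min, books/min) from an 'mm:ss.xx' elapsed time."""
    minutes, seconds = elapsed.split(":")
    total_min = int(minutes) + float(seconds) / 60.0
    mb_per_min = size_gib * 1024 / total_min  # 1024-based, matching the table
    return round(mb_per_min), round(books / total_min)

for run, t in [("SSD to SSD", "13:54.10"), ("SSD to HDD", "16:55.52"),
               ("HDD to HDD", "20:19.20"), ("Flash to HDD", "17:37.79")]:
    # reproduces the 178/355, 147/291, 122/243, 141/280 figures from the table
    print(run, rates(t))
```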
Options:
-d, --duplicates Add books to database even if they already exist.
Comparison is done based on book titles.
-r, --recurse Process directories recursively
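Since --duplicates only compares titles, one way to catch identifier-level duplicates after an import is to dump the library with calibredb list --for-machine (which emits JSON) and group by the url identifier. A sketch, assuming the JSON shape shown in the hardcoded sample below; verify against what your calibre version actually outputs:

```python
import json
from collections import defaultdict

# Sample of what `calibredb list --for-machine --fields title,identifiers`
# emits (shape assumed here; check your calibre version's actual output).
sample = """[
  {"id": 1, "title": "Old Title", "identifiers": {"url": "https://archiveofourown.org/works/1234567"}},
  {"id": 2, "title": "New Title", "identifiers": {"url": "https://archiveofourown.org/works/1234567"}},
  {"id": 3, "title": "Other Fic", "identifiers": {"url": "https://archiveofourown.org/works/7654321"}}
]"""

def duplicate_ids(records):
    """Map each url identifier to the calibre book ids sharing it (>1 only)."""
    by_url = defaultdict(list)
    for rec in records:
        url = rec.get("identifiers", {}).get("url")
        if url:
            by_url[url].append(rec["id"])
    return {url: ids for url, ids in by_url.items() if len(ids) > 1}

print(duplicate_ids(json.loads(sample)))  # books 1 and 2 share the same work URL
```

The Find Duplicates plugin does this kind of identifier comparison inside the GUI; this is just the command-line equivalent of the idea.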