Old 10-13-2018, 05:19 PM   #14
ilovejedd
Posts: 5,110
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9
Quote:
Originally Posted by Tanjamuse View Post
Is it possible to avoid the duplicate check when adding books to an empty library?

It only checks the title, not the title and author together or any other columns.

Would it be possible to either override this duplicate check, or make sure it compares more than one column when deciding whether a book is a duplicate?
I just set the ignore-duplicates behavior via the command line (the --duplicates flag below). Not sure how to do the same in the GUI.

I use the Find Duplicates plugin to check for duplicates after import. I've got plenty of fanfics that have changed author pseudonym or title so the url identifier (e.g. https://archiveofourown.org/works/1234567) is the most reliable method of catching duplicates.
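The identifier-based matching described above can be sketched in Python. This is a made-up illustration, not the Find Duplicates plugin's actual code; the book records and URLs are hypothetical examples:

```python
from collections import defaultdict

def duplicates_by_identifier(books, scheme="url"):
    """Group book records by an identifier scheme and return
    only the groups containing more than one book."""
    groups = defaultdict(list)
    for book in books:
        ident = book.get("identifiers", {}).get(scheme)
        if ident:  # books without this identifier can't be matched
            groups[ident].append(book["title"])
    return {url: titles for url, titles in groups.items() if len(titles) > 1}

# Hypothetical records: the same AO3 work under an old and a new title
books = [
    {"title": "Old Title", "identifiers": {"url": "https://archiveofourown.org/works/1234567"}},
    {"title": "New Title", "identifiers": {"url": "https://archiveofourown.org/works/1234567"}},
    {"title": "Unrelated", "identifiers": {"url": "https://archiveofourown.org/works/7654321"}},
]
print(duplicates_by_identifier(books))
```

Matching on the identifier is what survives title and pseudonym changes, which is exactly why title-only checks miss these cases.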

I was actually quite curious about performance so I ran some import tests using part of my AO3 fanfic library.

SSD used is a 500GB Samsung 840 with planar TLC NAND. It's 5-6 years old and 90% full, so it's probably quite slow by modern SSD standards, on top of normal performance degradation.

HDD used is a brand new, empty 1TB 7200RPM Seagate Barracuda (found it in my box of spare parts).

Flash drive used is a 128GB Samsung Bar USB 3.1 (connected to USB 3.0 port).


Code:
calibredb add --duplicates --recurse --library-path "X:\Calibre Portable\TestLibrary" "X:\ebooks\import"


Import Stats
               mm:ss.00  MB/min  books/min
SSD to SSD     13:54.10    178      355
SSD to HDD     16:55.52    147      291
HDD to HDD     20:19.20    122      243
Flash to HDD   17:37.79    141      280
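The MB/min and books/min columns follow from the totals listed further down (2.422 GB and 4,930 books). A quick sanity check of the SSD-to-SSD row:

```python
def throughput(elapsed, mb, books):
    """Convert an mm:ss.00 elapsed time plus totals into
    (MB per minute, books per minute), rounded."""
    mins, secs = elapsed.split(":")
    minutes = int(mins) + float(secs) / 60
    return round(mb / minutes), round(books / minutes)

total_mb = 2.422 * 1024  # 2.422 GB imported
print(throughput("13:54.10", total_mb, 4930))  # SSD to SSD row -> (178, 355)
```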


Import Structure:
\Fandom\Authors\Authors - Title (id).ext

2.422 GB, 979 folders, 24,650 files

4,930 "unique" books (based on checksum)
* each book has epub, mobi, txt, opf & cover

2,411 unique titles (based on url identifier)
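The "unique by checksum" count above can be reproduced with a short script. This sketch hashes in-memory byte blobs for illustration; against a real library you would hash the actual epub/mobi files, and the choice of SHA-256 here is my assumption, not necessarily what was used:

```python
import hashlib

def count_unique(blobs):
    """Count distinct items by the SHA-256 checksum of their contents."""
    return len({hashlib.sha256(b).hexdigest() for b in blobs})

# Hypothetical contents: two identical copies and one distinct file
files = [b"chapter one...", b"chapter one...", b"something else"]
print(count_unique(files))  # -> 2
```

Checksums catch byte-identical copies; the identifier check is still needed for the same work re-downloaded with different metadata or formatting.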


Options:
  -d, --duplicates      Add books to database even if they already exist.
                        Comparison is done based on book titles.

  -r, --recurse         Process directories recursively