04-26-2010, 06:43 AM   #1
dricciardi
Junior Member
dricciardi began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Apr 2010
Device: Kindle
Find duplicated items in library

Hi all,

I have a question that I couldn't find an answer to in the forum or in the FAQ.

I bulk-imported an entire ebook library containing a lot of books in various formats. How can I find duplicates after the import has finished?

Thank you

DR
04-26-2010, 06:59 AM   #2
itimpi
Wizard
itimpi ought to be getting tired of karma fortunes by now.
 
Posts: 4,055
Karma: 777825
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
It all depends on the state of the metadata! If the titles are OK then you can simply sort by title to get duplicates next to each other. Then you could try using the "merge" facility to get these down to a single entry.

If the metadata is not OK, so that the titles are all over the place, then I do not think there is any easy way of doing this.
04-26-2010, 07:01 AM   #3
dricciardi
Junior Member
dricciardi began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Apr 2010
Device: Kindle
Thanks a lot. I think a long, long data-entry job is waiting for me :-)
04-26-2010, 10:06 AM   #4
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by dricciardi View Post
I think a long, long data-entry job is waiting for me :-)
I find duplicate titles by opening the metadata.db file in SQLiteSpy and running an SQL query.

Alternatively, this will also do it:

Code:
calibre-debug -c "from calibre.library.database2 import LibraryDatabase2; db = LibraryDatabase2('/path/to/library/folder'); dupes = db.conn.get('select title from books group by title having count(*) > 1;'); print(dupes)" > dupes.txt
Copy that long line, change the path to point to your library folder, and paste it into a command window (terminal window, DOS box, whatever you call it). It will produce a file called dupes.txt in the current directory with all your duplicate titles in it. It's a list of dupe titles in Unicode format.
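
The same query can also be run outside calibre with Python's standard sqlite3 module, much like the SQLiteSpy approach. This is only a sketch: the function name find_duplicate_titles is made up for illustration, and it assumes metadata.db sits in your library folder and contains the books table with a title column that the query above relies on. It is probably safest to run it against a copy of metadata.db while calibre is closed.

```python
import sqlite3


def find_duplicate_titles(db_path):
    """Return (title, count) pairs for titles that occur more than once.

    Assumption: db_path points at a calibre metadata.db (or a copy of it)
    with a `books` table that has a `title` column.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT title, COUNT(*) FROM books "
            "GROUP BY title HAVING COUNT(*) > 1 "
            "ORDER BY title"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Calling find_duplicate_titles('/path/to/library/folder/metadata.db') should give back a list of (title, count) pairs that you can print or write to a file.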
04-26-2010, 03:29 PM   #5
Sabardeyn
Guru
Sabardeyn ought to be getting tired of karma fortunes by now.
 
Posts: 629
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Starson17,

Quote:
Originally Posted by Starson17 View Post
Code:
calibre-debug -c "from calibre.library.database2 import LibraryDatabase2; db = LibraryDatabase2('/path/to/library/folder'); dupes = db.conn.get('select title from books group by title having count(*) > 1;'); print(dupes)" > dupes.txt
This will only find exact title matches, possibly ignoring case, correct? So it cannot find titles that differ slightly.
04-26-2010, 03:38 PM   #6
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Sabardeyn View Post
This will only find exact title matches, possibly ignoring case, correct? So it cannot find titles that differ slightly.
Correct. It's just an SQL query and finds exact title matches only. I suppose you could revise the query to do some fuzzy matching (but then the line would get really long).
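
If the fuzzy matching is done in Python rather than in the SQL itself, the line-length problem goes away. Here is a rough sketch using the standard difflib module. It is not anything calibre provides: the function names normalize and find_similar_titles and the 0.9 threshold are my own assumptions, and it compares every pair of titles, so it could be slow on a very large library.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def find_similar_titles(titles, threshold=0.9):
    """Return (title_a, title_b, ratio) for pairs whose normalized forms are similar.

    threshold is an arbitrary illustrative cut-off between 0 and 1;
    1.0 means the normalized titles are identical.
    """
    normalized = [(t, normalize(t)) for t in titles]
    matches = []
    # Compare every pair once; this is quadratic in the number of titles.
    for (a, norm_a), (b, norm_b) in combinations(normalized, 2):
        ratio = SequenceMatcher(None, norm_a, norm_b).ratio()
        if ratio >= threshold:
            matches.append((a, b, ratio))
    return matches
```

You would feed it the list of titles pulled from the books table and then look over the reported pairs by hand, since a high similarity score does not guarantee the two entries are really the same book.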
All times are GMT -4.