04-26-2010, 06:43 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2010
Device: Kindle
|
Find duplicated items in library
Hi all,
I have a question that I couldn't find answered in the forum or in the FAQ. I bulk-imported an entire ebook library containing a lot of books in all sorts of formats. How can I find duplicates once the import has finished? Thank you, DR |
04-26-2010, 06:59 AM | #2 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
It all depends on the state of the metadata! If the titles are OK then you can simply sort by title to get duplicates next to each other. Then you could try using the "merge" facility to get these down to a single entry.
If the metadata is not OK, so that the titles are all over the place, then I do not think there is any easy way of doing this. |
04-26-2010, 07:01 AM | #3 |
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2010
Device: Kindle
|
Thanks a lot. I think a long, long data-entry job is waiting for me :-)
|
04-26-2010, 10:06 AM | #4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I find duplicate titles by opening the library's metadata.db file in SQLiteSpy and running an SQL query.
Alternatively, this will also do it: Code:
calibre-debug -c "from calibre.library.database2 import LibraryDatabase2; db = LibraryDatabase2('/path/to/library/folder');dupes = db.conn.get('select title from books group by title having count(*) > 1;');print dupes;">dupes.txt |
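For anyone who would rather not go through calibre-debug, here is a minimal sketch of the same query run with Python's standard sqlite3 module. It assumes only what the one-liner above relies on: a `books` table with a `title` column. The function name `find_duplicate_titles` and the tiny in-memory table are made up for illustration; to try it on a real library, point the connection at a copy of metadata.db (the commented-out line shows where).

```python
import sqlite3


def find_duplicate_titles(conn):
    """Return (title, copies) rows for every title that appears more than once."""
    query = (
        "SELECT title, COUNT(*) AS copies "
        "FROM books "
        "GROUP BY title "
        "HAVING COUNT(*) > 1 "
        "ORDER BY title"
    )
    return conn.execute(query).fetchall()


# For a real library, connect to a *copy* of the database instead, e.g.:
# conn = sqlite3.connect("/path/to/library/folder/metadata.db")

# Illustrative stand-in table so the sketch runs on its own (not real data).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
demo.executemany(
    "INSERT INTO books (title) VALUES (?)",
    [("Dune",), ("Emma",), ("Dune",)],
)

for title, copies in find_duplicate_titles(demo):
    print(f"{copies} x {title}")
```

Like the one-liner, this only reports titles that match exactly.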
04-26-2010, 03:29 PM | #5 |
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
|
Starson17,
This will only find exact title matches, possibly ignoring case, correct? So it cannot find any titles that are slightly different. |
04-26-2010, 03:38 PM | #6 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Correct. It's just an SQL query and finds exact title matches only. I suppose you could revise the query to do some fuzzy matching (but then the line would get really long.)
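One rough way to get some fuzziness without a very long SQL line is to pull the titles out and compare them in Python instead. The sketch below is not the query from this thread; it swaps in a simple normalization approach: lowercase each title, replace punctuation with spaces, drop a leading "the", "a" or "an", collapse whitespace, and then group titles whose normalized form is identical. The function names `normalize_title` and `group_near_duplicates` are hypothetical, and the rules are only an assumption about what "slightly different" might mean for a given library.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Reduce a title to a crude comparison key."""
    key = title.lower()
    key = re.sub(r"[^\w\s]", " ", key)           # punctuation becomes spaces
    key = re.sub(r"^\s*(the|a|an)\s+", "", key)  # drop one leading article
    return " ".join(key.split())                 # collapse runs of whitespace


def group_near_duplicates(titles):
    """Group titles that share the same normalized key."""
    groups = defaultdict(list)
    for title in titles:
        groups[normalize_title(title)].append(title)
    # Keep only the keys that more than one title mapped to.
    return [group for group in groups.values() if len(group) > 1]


# Illustrative titles only, not taken from any real library.
sample = ["The Hobbit", "the hobbit!", "Dune", "Dune.", "Emma"]
for group in group_near_duplicates(sample):
    print(group)
```

This will still miss misspellings and reordered titles such as "Hobbit, The", so treat the groups as candidates to check by hand before merging anything.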
|
|