#1 |
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
|
find duplicates?
Does anyone know of a plugin or a procedure to find and replace duplicate tags, publishers, authors, and series? I have a number of instances where the same publisher is duplicated a dozen times with only very small changes, for instance ORiley, O'Riley, O' Riley, Inc., O' Riley Publishing, etc. The same goes for tags, authors, and even series.
|
#2 | |
Well trained by Cats
Posts: 31,041
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Or select all the variants in the Tag Browser (Shift; additional ones are marked with a green plus), select all the matching books (Ctrl-A), open Edit Metadata, and set the item to the preferred choice. |
|
#3 |
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
|
theducks,
Thanks, but I was hoping for something a bit faster than doing them by hand. I have over 10k authors, 25k tags, etc. |
#4 | |
Addict
Posts: 385
Karma: 6514
Join Date: Aug 2010
Location: Denmark
Device: Kindle 3 3G+Wifi, Oasis
|
Quote:
https://www.mobileread.com/forums/sho...52&postcount=1
Brgds, Per |
|
#5 | |
Well trained by Cats
Posts: 31,041
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
How does this magic tool decide which similar names can be combined and which should not? Example from my test lib (Authors; screenshot not preserved): the first 3 are all the same publisher. The next one is only similar in name. Then there is the Berkley grouping: when should imprints be combined, and when left as is? Or for Authors (screenshots not preserved). |
|
#6 |
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
|
The ultimate magic tool would...
Author: I thought about matching a controllable number of characters in the surname and grouping on that, but to keep it simple it would probably just need to match the surname. There will be lots of false positives, but it would still be faster than searching through a monster list.
Publisher: match the first word and display any non-matches. In your example, the first 3 would come up as matches: all 3 Ballantine, both Bantam, etc. Most of the non-conformity I see in publishers comes from someone truncating the name in the metadata or the company expanding it later.
Forget I mentioned series. I only have about 100 of those, so it will probably be faster to do them by hand.
Working with Manage authors, tags, etc. is OK for some tasks but seems very limited. |
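The two matching rules above could be sketched roughly like this in Python. It is only an illustration that works on plain lists of names; it does not touch calibre's database or API. The helper names (`surname_key`, `first_word_key`, `group_by_key`) and the sample names are made up for the example, and it assumes authors are written either "First Last" or "Last, First".

```python
from collections import defaultdict


def surname_key(author):
    """Hypothetical helper: lower-cased surname of an author name.

    Assumes "Last, First" when a comma is present, otherwise "First Last".
    """
    author = author.strip()
    if "," in author:
        surname = author.split(",", 1)[0]
    else:
        parts = author.split()
        surname = parts[-1] if parts else ""
    return surname.strip().lower()


def first_word_key(publisher):
    """Hypothetical helper: lower-cased first word of a publisher name."""
    words = publisher.split()
    return words[0].lower() if words else ""


def group_by_key(names, key_func):
    """Group names by key; keep only keys with more than one distinct name."""
    groups = defaultdict(set)
    for name in names:
        groups[key_func(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Invented sample data, not taken from any real library.
publishers = [
    "Ballantine", "Ballantine Books", "Ballantine Pub. Group",
    "Bantam", "Bantam Books", "Tor",
]
authors = ["Isaac Asimov", "Asimov, Isaac", "Janet Asimov", "Arthur C. Clarke"]

publisher_groups = group_by_key(publishers, first_word_key)
author_groups = group_by_key(authors, surname_key)

for key, variants in {**publisher_groups, **author_groups}.items():
    print(key, "->", variants)
```

With this sample data the two Asimovs land in the same group, which is exactly the kind of false positive mentioned above. The output is meant as a review list for a human, not something to merge automatically.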
#7 |
Wizard
Posts: 4,812
Karma: 26912940
Join Date: Apr 2010
Device: sony PRS-T1 and T3, Kobo Mini and Aura HD, Tablet
|
The Quality Check plugin has some useful functions for authors (it checks for reversed author sort, among other things).
Also, if your authors are sorted ln,fn it is easy to find discrepancies in the Tag Browser. I have never cared much about publishers, but I guess neatness is your concern.
I have only 3000 authors but 1800 series. I manually checked them a while ago and there were under 1000, but I have been updating metadata on books with no comments and I guess they crept back in.
When manually checking them I found that a series could have 5 different but similar names: XXX, The XXX Mysteries, An XXX Mystery, XXX Mysteries, An XXX Mystery Story. Oh well, someday a plugin will come.
Helen |
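Series variants like the five listed above differ mostly by a leading article and a trailing word such as "Mystery" or "Mysteries". One possible way to surface them for review is to reduce each name to a rough key and group on it. This is a sketch under that assumption, not an existing plugin: `series_key`, `group_series`, and the word lists are hypothetical and would need adjusting for a real library.

```python
import re
from collections import defaultdict

# Assumed words to ignore; extend these for your own library.
LEADING_ARTICLES = ("the ", "an ", "a ")
TRAILING_WORDS = re.compile(r"\s+(mystery story|mysteries|mystery)$")


def series_key(series):
    """Hypothetical helper: reduce a series name to a rough comparison key."""
    # Lower-case and collapse runs of whitespace.
    key = " ".join(series.lower().split())
    # Drop one leading article, if present.
    for article in LEADING_ARTICLES:
        if key.startswith(article):
            key = key[len(article):]
            break
    # Drop a trailing "mystery"-style suffix, if present.
    return TRAILING_WORDS.sub("", key)


def group_series(names):
    """Group series names by key; keep only keys with more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[series_key(name)].append(name)
    return {key: found for key, found in groups.items() if len(found) > 1}


variants = [
    "XXX",
    "The XXX Mysteries",
    "An XXX Mystery",
    "XXX Mysteries",
    "An XXX Mystery Story",
]
groups = group_series(variants)
print(groups)
```

All five sample names reduce to the same key, so they show up as one group to check by hand.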
Thread | Thread Starter | Forum | Replies | Last Post |
[GUI Plugin] Find Duplicates | kiwidude | Plugins | 1124 | 04-18-2025 09:19 AM |
.... and again duplicates .... | jekkii | Calibre | 4 | 02-09-2011 08:20 AM |
Duplicates | pauldadams | Calibre | 17 | 05-04-2010 11:57 PM |
Duplicates... | jaxx6166 | Sony Reader | 5 | 07-09-2009 09:13 PM |
duplicates in database | RJA | Calibre | 3 | 06-22-2009 09:06 AM |