MobileRead Forums > E-Book Software > Calibre > Library Management

08-25-2011, 10:50 PM   #1
jlutes
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
find duplicates?

Does someone know of a plugin or a procedure to find and replace duplicate tags, publishers, authors, and series? I have a number of instances where the same publisher is duplicated a dozen times with only very small changes, for instance ORiley, O'Riley, O' Riley, Inc., O' Riley Publishing, etc. The same goes for tags, authors, and even series.
08-25-2011, 11:08 PM   #2
theducks
Well trained by Cats
Posts: 31,041
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jlutes
Does someone know of a plugin or a procedure to find and replace duplicate tags, publishers, authors, and series? I have a number of instances where the same publisher is duplicated a dozen times with only very small changes, for instance ORiley, O'Riley, O' Riley, Inc., O' Riley Publishing, etc. The same goes for tags, authors, and even series.
In the Tag Browser, right-click a variant and choose Rename, then enter the preferred name.
Or
select all the variants in the Tag Browser (Shift; mark additional ones with the green plus).

Then select all the listed books (Ctrl-A), open Edit Metadata, and set the item to the preferred choice.
08-26-2011, 09:58 AM   #3
jlutes
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
theducks,
Thanks, but I was hoping for something a bit faster than doing them by hand. I have over 10k authors, 25k tags, etc.
08-26-2011, 10:02 AM   #4
pchrist7
Addict
Posts: 385
Karma: 6514
Join Date: Aug 2010
Location: Denmark
Device: Kindle 3 3G+Wifi, Oasis
Quote:
Originally Posted by jlutes
Does someone know of a plugin or a procedure to find and replace duplicate tags, publishers, authors, and series? I have a number of instances where the same publisher is duplicated a dozen times with only very small changes, for instance ORiley, O'Riley, O' Riley, Inc., O' Riley Publishing, etc. The same goes for tags, authors, and even series.
Have you looked at this plugin? It might not do all you need, but it should cover at least some of it:
https://www.mobileread.com/forums/sho...52&postcount=1

Brgds, Per
08-26-2011, 10:12 AM   #5
theducks
Well trained by Cats
Posts: 31,041
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jlutes
theducks,
Thanks, but I was hoping for something a bit faster than doing them by hand. I have over 10k authors, 25k tags, etc.
That is a lot of cleanup to do.

How does this magic tool decide which similar names can be combined and which should not?
Example from my test lib: Authors
The first 3 are all the same publisher. The next one is only similar in name.

Then there is the Berkley grouping. When do you combine imprints, and when do you leave them as is?

Or for authors: do you combine all those with the same surname?
Attached thumbnail: Screenshot-calibre - || My eBooks ||.png
08-26-2011, 12:32 PM   #6
jlutes
Connoisseur
Posts: 52
Karma: 12
Join Date: Jul 2011
Device: none
The ultimate magic tool would...

Author: I thought about matching a controllable number of characters in the surname and grouping on that, but to keep it simple it would probably just need to match the whole surname. There will be lots of false positives, but it is still faster than searching through a monster list.

Publisher: match on the first word and display any non-matches. In your example, the first 3 would come up as matches: all 3 Ballantine, both Bantam, etc. Most of the non-conformity I see in publishers comes from someone truncating the name in the metadata or the company expanding it later.
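
A minimal sketch of the grouping described above, in plain Python. It is not an existing calibre feature or plugin: names are bucketed by a crude key (the surname for authors, the first word for publishers), and only buckets holding more than one distinct spelling are reported for manual review. The function names and the sample names are hypothetical, and the surname rule assumes names are written either "Last, First" or "First Last".

```python
from collections import defaultdict


def surname_key(author):
    """Crude surname: the text before a comma, else the last word (lowercased)."""
    author = author.strip()
    if "," in author:
        return author.split(",", 1)[0].strip().lower()
    parts = author.split()
    return parts[-1].lower() if parts else ""


def first_word_key(publisher):
    """First whitespace-separated word of the name, lowercased."""
    parts = publisher.split()
    return parts[0].lower() if parts else ""


def candidate_groups(names, key_func):
    """Return {key: sorted variants} for every key shared by 2+ distinct names."""
    buckets = defaultdict(set)
    for name in names:
        key = key_func(name)
        if key:  # skip empty names
            buckets[key].add(name)
    return {k: sorted(v) for k, v in buckets.items() if len(v) > 1}


# Hypothetical sample data, for illustration only.
publishers = ["Ballantine", "Ballantine Books", "Ballantine Del Rey",
              "Bantam", "Bantam Spectra", "Tor"]
authors = ["Asimov, Isaac", "Isaac Asimov", "Janet Asimov", "Arthur C. Clarke"]

publisher_groups = candidate_groups(publishers, first_word_key)
author_groups = candidate_groups(authors, surname_key)

for key, variants in {**publisher_groups, **author_groups}.items():
    print(key, "->", variants)
```

Note that "Janet Asimov" lands in the same bucket as the two Isaac Asimov spellings. That is the kind of false positive mentioned above, so the output is a review list rather than something to merge automatically.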

Forget I mentioned series - I only have about 100 of those - it'll be faster to do it by hand probably.

Working with Manage authors, tags, etc. is OK for some tasks, but it seems very limited.
08-26-2011, 02:52 PM   #7
speakingtohe
Wizard
Posts: 4,812
Karma: 26912940
Join Date: Apr 2010
Device: sony PRS-T1 and T3, Kobo Mini and Aura HD, Tablet
The quality check plugin has some useful functions for authors (checks for reversed author sort and other things).

Also, if your authors are sorted ln,fn, it is easy to find discrepancies in the Tag Browser.

I have never cared much about publishers, but I guess neatness is your concern.

I have only 3000 authors but 1800 series. I manually checked them a while ago and there were under 1000, but I have been updating metadata on books with no comments, and I guess they crept back in.

When manually checking them, I found that a series could have 5 different but similar names:
XXX, The XXX Mysteries, An XXX Mystery, XXX Mysteries, An XXX Mystery Story.
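
One way to surface variants like these is to reduce each series name to a normalized key before comparing. This is a rough sketch only, assuming that leading articles and trailing words such as "Mystery", "Mysteries", and "Story" are noise. The word lists and function names are assumptions made for illustration and are not part of calibre.

```python
import re
from collections import defaultdict

# Assumed noise words; adjust to suit the library being cleaned.
LEADING_ARTICLES = {"a", "an", "the"}
TRAILING_NOISE = {"mystery", "mysteries", "story", "stories", "series"}


def series_key(name):
    """Lowercase the name, drop leading articles and trailing noise words."""
    words = re.findall(r"[a-z0-9']+", name.lower())
    while words and words[0] in LEADING_ARTICLES:
        words.pop(0)
    # Always keep at least one word so a name is never reduced to nothing here.
    while len(words) > 1 and words[-1] in TRAILING_NOISE:
        words.pop()
    return " ".join(words)


def group_series(names):
    """Return {key: variants} for keys that match 2+ series names."""
    groups = defaultdict(list)
    for name in names:
        groups[series_key(name)].append(name)
    return {k: v for k, v in groups.items() if len(v) > 1}


variants = ["XXX", "The XXX Mysteries", "An XXX Mystery",
            "XXX Mysteries", "An XXX Mystery Story"]
groups = group_series(variants)
print(groups)
```

All five spellings above reduce to the same key, so they show up as one group to review and rename by hand.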

Oh well, someday a plugin will come.

Helen
MobileRead.com is a privately owned, operated and funded community.