Old 04-29-2011, 07:05 AM   #1
jacr
Junior Member
Posts: 5
Karma: 12
Join Date: Apr 2011
Device: Kindle 3
Mass delete of unpopular tags

Hi all. I've searched somewhat, but cannot seem to find an answer to this one:

I've got a large collection that I've imported a lot of social metadata for. I've ended up with around 3,000 tags, the majority (>2,500) of which are applied to only a few books. I'd like to delete all unpopular tags, say all those that occur on fewer than 10 books. I can't find a way to do this through the Calibre UI without clicking "delete tag" 2,500 times.

Any suggestions?

Much appreciated.
Old 04-29-2011, 09:16 AM   #2
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
You can use bulk metadata edit to edit the tags or to run a search & replace on the tags field. That will cut down the editing required a little, but I don't know of a way to delete all tags that are applied to fewer than X books.
The Quality Check plugin may be of interest to you, although that too will only select books based on tag count.
Old 04-29-2011, 10:58 PM   #3
jacr
Junior Member
Posts: 5
Karma: 12
Join Date: Apr 2011
Device: Kindle 3
I found a way to do this. I did the following:
1) Backed up the calibre database (metadata.db)
2) Downloaded SQLite Administrator
3) Ran the following SQL against the database:

Quote:
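-- Remove the book-to-tag links for every tag used on fewer than 30 books.
-- The inner query finds those tags; the outer DELETE drops their link rows.
DELETE FROM books_tags_link
WHERE tag IN (
    SELECT tag FROM books_tags_link
    GROUP BY tag
    HAVING COUNT(tag) < 30
);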
This deleted all tags that are on fewer than 30 books.
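
For readers who would rather script the steps above than use a GUI tool, here is a minimal sketch in Python's standard sqlite3 module. It assumes only what the query in this post already relies on: a books_tags_link table with book and tag columns. The function name, the min_books parameter, and the toy in-memory data are illustrative, not part of calibre.

```python
import sqlite3


def prune_rare_tag_links(conn, min_books):
    """Delete book-to-tag links for tags used on fewer than min_books books.

    Returns the number of link rows removed.
    """
    with conn:  # commits on success, rolls back if the statement fails
        cur = conn.execute(
            """
            DELETE FROM books_tags_link
            WHERE tag IN (
                SELECT tag FROM books_tags_link
                GROUP BY tag
                HAVING COUNT(*) < ?
            )
            """,
            (min_books,),
        )
    return cur.rowcount


def demo():
    # Toy stand-in for metadata.db (assumption: simplified columns only).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books_tags_link ("
        "id INTEGER PRIMARY KEY, book INTEGER NOT NULL, tag INTEGER NOT NULL)"
    )
    # Tag 1 is on five books, tag 2 on two books, tag 3 on one book.
    links = [(book, 1) for book in range(1, 6)] + [(1, 2), (2, 2), (3, 3)]
    conn.executemany(
        "INSERT INTO books_tags_link (book, tag) VALUES (?, ?)", links
    )
    removed = prune_rare_tag_links(conn, min_books=3)
    remaining = conn.execute(
        "SELECT DISTINCT tag FROM books_tags_link"
    ).fetchall()
    return removed, remaining
```

Against a real library, step 1 still applies: copy metadata.db somewhere safe first (shutil.copy2 would do) and run the script with calibre closed.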
Old 07-21-2011, 11:06 AM   #4
Arkivaren
Member
Posts: 10
Karma: 94
Join Date: Jun 2011
Location: Odense, DK
Device: Kindle
Quote:
Originally Posted by jacr
I found a way to do this. I did the following
1) Backed up the calibre database (metadata.db)
2) Downloaded SQLite Administrator
3) Ran the following SQL against the database:



This deleted all tags that are on less than 30 books.

You, sir, are a genius.
Old 07-21-2011, 01:45 PM   #5
theducks
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jacr
I found a way to do this. I did the following
1) Backed up the calibre database (metadata.db)
2) Downloaded SQLite Administrator
3) Ran the following SQL against the database:



This deleted all tags that are on less than 30 books.

That deletes the tags from the books (the link table).
What cleans up the tags table?
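
The thread never answers this question, so the following is only a sketch of one way the leftover rows could be cleaned up by hand. It assumes a tags table whose id column is what books_tags_link.tag refers to; the simplified schema and the sample tag names are assumptions made for the example.

```python
import sqlite3


def delete_orphan_tags(conn):
    """Remove rows from tags that no book links to any more.

    Returns the number of tag rows removed.
    """
    with conn:
        cur = conn.execute(
            "DELETE FROM tags "
            "WHERE id NOT IN (SELECT DISTINCT tag FROM books_tags_link)"
        )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in schema (assumption), with one tag left unlinked.
    conn.executescript(
        """
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE books_tags_link (
            id INTEGER PRIMARY KEY,
            book INTEGER NOT NULL,
            tag INTEGER NOT NULL
        );
        INSERT INTO tags (id, name)
            VALUES (1, 'Fiction'), (2, 'Fcition'), (3, 'History');
        INSERT INTO books_tags_link (book, tag)
            VALUES (10, 1), (11, 1), (12, 3);
        """
    )
    removed = delete_orphan_tags(conn)
    names = [row[0] for row in conn.execute("SELECT name FROM tags ORDER BY id")]
    return removed, names
```

Running the link deletion first and this cleanup second, inside the same backup-protected session, keeps the two tables consistent with each other.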
Old 07-21-2011, 04:30 PM   #6
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by jacr
This deleted all tags that are on less than 30 books.
I weep for the poor unwanted data consigned to the bit black hole of unrequited binary love
Old 07-21-2011, 04:54 PM   #7
Arkivaren
Member
Posts: 10
Karma: 94
Join Date: Jun 2011
Location: Odense, DK
Device: Kindle
Quote:
Originally Posted by Starson17
I weep for the poor unwanted data consigned to the bit black hole of unrequited binary love
I don't hehe

My tag count went from about 4,500 to approximately 350, and I only deleted tags used fewer than 6 times.

Btw I found that doing a:

Code:
SELECT COUNT(DISTINCT tag) FROM books_tags_link
before and after the deletion gives a nice indication of what's been done (for some reason SQLite Administrator doesn't).
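
The before-and-after check can be scripted as well. This sketch wraps the same COUNT(DISTINCT tag) query in a helper and runs it around a deletion on toy data; the helper name, the threshold of 2, and the sample rows are made up for the illustration.

```python
import sqlite3


def distinct_tag_count(conn):
    """Number of distinct tags still linked to at least one book."""
    return conn.execute(
        "SELECT COUNT(DISTINCT tag) FROM books_tags_link"
    ).fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books_tags_link ("
        "id INTEGER PRIMARY KEY, book INTEGER NOT NULL, tag INTEGER NOT NULL)"
    )
    # Tag 1 is on three books; tags 2 and 3 are on one book each.
    conn.executemany(
        "INSERT INTO books_tags_link (book, tag) VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 1), (1, 2), (2, 3)],
    )
    before = distinct_tag_count(conn)
    with conn:
        conn.execute(
            "DELETE FROM books_tags_link WHERE tag IN ("
            "SELECT tag FROM books_tags_link GROUP BY tag HAVING COUNT(*) < 2)"
        )
    after = distinct_tag_count(conn)
    return before, after
```

Printing the two numbers side by side gives the same sanity check described in this post, without relying on the GUI tool to report affected rows.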
Old 07-22-2011, 09:44 AM   #8
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Arkivaren
I only deleted tags used fewer than 6 times.
So you kept the "Fiction" tag found on 1200 books and threw away the "Fiction - American Pre-Revolutionary War" tag found on only 4 books.

Personally, I'd have done it the other way: thrown away all general tags found on more than 6 books and kept the useful specific tags.

I love wandering through my tag list and finding hidden bits of subject matter that some helpful librarian, author or publisher spent the time to tag for me.

Different strokes for different folks!
Old 07-22-2011, 10:04 AM   #9
HomeInMyShoes
Grand Sorcerer
Posts: 19,226
Karma: 67780237
Join Date: Jul 2011
Device: none
^This is really a question of indexing exhaustivity versus term specificity. Both sets of tags (frequent and infrequent) have their uses. Of course, if everything has the tag "book", then that tag is useless.

My own opinion is that removing less frequent tags completely negates the importance of the long tail.
Old 07-22-2011, 10:48 AM   #10
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by HomeInMyShoes
^This is really a question of indexing exhaustivity versus term specificity. Both sets of tags (frequent and infrequent) have their uses. Now if everything has the tag of book then it is useless.

My own opinion is that removing less frequent tags completely negates the importance of the long tail.
I love the long tail, but I understand why some want to reduce their tag list. I keep a custom genre column that serves the purpose of broad categorization.

Instead of throwing away the long tail, there were other options. They could have copied the tags into a custom tag-type column and thrown away the tail from that column. That way, a tag on 5 books might later have made it past the 6 book threshold when another book with that tag arrived and they ran the process again (copy all tags to a custom column and delete the long tail from the custom column).

Data is so hard to come by that throwing it away seems a crime.
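
The copy-then-prune idea can be illustrated at the SQL level. This is only a sketch of the principle: it fills a plain side table, here given the hypothetical name popular_tags_link, and is not a calibre custom column, so calibre itself would not display it. The original links are never modified, which is what lets a tag cross the threshold on a later run.

```python
import sqlite3


def rebuild_popular_links(conn, min_books):
    """Refill popular_tags_link (hypothetical side table) from the untouched links.

    Only tags used on at least min_books books are copied.
    """
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS popular_tags_link ("
            "book INTEGER NOT NULL, tag INTEGER NOT NULL)"
        )
        conn.execute("DELETE FROM popular_tags_link")
        conn.execute(
            """
            INSERT INTO popular_tags_link (book, tag)
            SELECT book, tag FROM books_tags_link
            WHERE tag IN (
                SELECT tag FROM books_tags_link
                GROUP BY tag
                HAVING COUNT(*) >= ?
            )
            """,
            (min_books,),
        )


def popular_tags(conn):
    rows = conn.execute("SELECT DISTINCT tag FROM popular_tags_link ORDER BY tag")
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books_tags_link ("
        "id INTEGER PRIMARY KEY, book INTEGER NOT NULL, tag INTEGER NOT NULL)"
    )
    # Tag 1 is on six books; tag 2 is on five, one short of the threshold.
    links = [(b, 1) for b in range(1, 7)] + [(b, 2) for b in range(1, 6)]
    conn.executemany(
        "INSERT INTO books_tags_link (book, tag) VALUES (?, ?)", links
    )
    rebuild_popular_links(conn, min_books=6)
    first = popular_tags(conn)
    # A new book arrives carrying tag 2, then the process is run again.
    conn.execute("INSERT INTO books_tags_link (book, tag) VALUES (7, 2)")
    rebuild_popular_links(conn, min_books=6)
    second = popular_tags(conn)
    total = conn.execute("SELECT COUNT(*) FROM books_tags_link").fetchone()[0]
    return first, second, total
```

In the demo, tag 2 is left out on the first run and included on the second, while every original link survives both runs.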
Old 07-27-2011, 05:21 PM   #11
Acousticvillage
Enthusiast
Posts: 37
Karma: 8276
Join Date: Sep 2010
Device: Kindle Paperwhite 2, iPad, Marvin 3, Mapleread
Is there any way the genius above could give me a way to import/copy all the values in my Genres column into the Tags column? I want to delete the tags data and replace it with the genre data so that the catalogue function will show my genres.
Old 07-27-2011, 05:47 PM   #12
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by Acousticvillage
Is there any way the genius above could give me a way to import / copy all the values/data in my Genres column into the Tags column? I want to delete the tags data and substitute it for genre data so that the catalogue function will show my genres....
I replied earlier to this very question in this thread. Please refrain from posting the same question multiple times.
Old 07-29-2011, 07:46 AM   #13
Arkivaren
Member
Posts: 10
Karma: 94
Join Date: Jun 2011
Location: Odense, DK
Device: Kindle
Quote:
Originally Posted by Starson17
I love the long tail, but I understand why some want to reduce their tag list. I keep a custom genre column that serves the purpose of broad categorization.

Instead of throwing away the long tail, there were other options. They could have copied the tags into a custom tag-type column and thrown away the tail from that column. That way, a tag on 5 books might later have made it past the 6 book threshold when another book with that tag arrived and they ran the process again (copy all tags to a custom column and delete the long tail from the custom column).

Data is so hard to come by that throwing it away seems a crime.
Ordinarily I'd agree; however, having permutation upon permutation of misspelled/malformed tags is not my idea of "data".

You mention librarians above… trust me, no sober librarian would have added the tags I deleted hehe

But as you say, of course YMMV, and it seems it does.

Btw, my approach above, combined with the GoodReads plugin + dwanthny's customization and the BISAC Subject list, has worked wonders in my library.

Last edited by Arkivaren; 07-29-2011 at 08:03 AM.
Old 07-29-2011, 10:28 AM   #14
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Arkivaren
Ordinarily I'd agree, however, having permutation upon permutation of
misspelled/malformed tags, is not my idea of "data".
There's a difference between fixing errors or making the tags consistent and throwing away all data that applies to six or fewer books. As you say, YMMV.
Old 07-29-2011, 10:45 AM   #15
HomeInMyShoes
Grand Sorcerer
Posts: 19,226
Karma: 67780237
Join Date: Jul 2011
Device: none
Quote:
Originally Posted by Arkivaren
Ordinarily I'd agree, however, having permutation upon permutation of
misspelled/malformed tags, is not my idea of "data".
Completely valid, although sometimes misspelled isn't misspelled and the jargon tells us something different. Then again, it's hard to discern tagger intent in many cases. In my opinion it may be better to fix than to throw away. Is there no "fix all occurrences of a tag" functionality within Calibre? I haven't used it enough to know. If not, it would be a good feature to ask for.

My own tagging is a mess sometimes. Some days I'll be a lumper and some days a splitter. Tags are an evolving beast with me.
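
As an illustration of fixing rather than deleting, here is a rough sketch of merging a misspelled tag into its correct spelling at the database level. Whether calibre offers this in its interface is not settled in this thread; the merge_tag helper, the simplified schema, and the sample names are assumptions made for the example.

```python
import sqlite3


def merge_tag(conn, wrong_name, right_name):
    """Re-point every link from wrong_name to right_name, then drop wrong_name."""
    wrong = conn.execute(
        "SELECT id FROM tags WHERE name = ?", (wrong_name,)
    ).fetchone()
    right = conn.execute(
        "SELECT id FROM tags WHERE name = ?", (right_name,)
    ).fetchone()
    if wrong is None or right is None:
        raise ValueError("both tags must already exist")
    wrong_id, right_id = wrong[0], right[0]
    if wrong_id == right_id:
        return  # nothing to merge
    with conn:
        # Books that already carry the correct tag would end up with a
        # duplicate link, so their misspelled link is removed first.
        conn.execute(
            "DELETE FROM books_tags_link WHERE tag = ? AND book IN "
            "(SELECT book FROM books_tags_link WHERE tag = ?)",
            (wrong_id, right_id),
        )
        conn.execute(
            "UPDATE books_tags_link SET tag = ? WHERE tag = ?",
            (right_id, wrong_id),
        )
        conn.execute("DELETE FROM tags WHERE id = ?", (wrong_id,))


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in schema (assumption).
    conn.executescript(
        """
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE books_tags_link (
            id INTEGER PRIMARY KEY,
            book INTEGER NOT NULL,
            tag INTEGER NOT NULL,
            UNIQUE (book, tag)
        );
        INSERT INTO tags (id, name) VALUES (1, 'Fiction'), (2, 'Fcition');
        INSERT INTO books_tags_link (book, tag)
            VALUES (10, 1), (11, 2), (12, 1), (12, 2);
        """
    )
    merge_tag(conn, "Fcition", "Fiction")
    links = conn.execute(
        "SELECT book, tag FROM books_tags_link ORDER BY book"
    ).fetchall()
    names = [row[0] for row in conn.execute("SELECT name FROM tags")]
    return links, names
```

Book 11 keeps its information under the corrected tag instead of losing it, which is the difference between fixing and throwing away that this part of the thread is arguing about.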