03-01-2013, 05:05 PM | #1 |
Pruning redundant and partially redundant tags
I've been using the Goodreads metadata download plugin to map tags to a hierarchy I like, and now I'd like to prune the redundant information out of the rest of my tags. I'm getting there with regex search and replace, but a more elegant solution would be appreciated. I'm also a little stumped on how to search for books that have redundant tags on anything better than a case-by-case basis.
Example:
foo.fie, foo.fie.fum, foo, fum fie
would become simply:
foo.fie.fum

or
fiction, genre.crime, genre.mystery, genre.mystery.hard-boiled, crime, mystery, mystery & detective, hardboiled mystery
would become:
genre.crime, genre.mystery.hardboiled

My regex is similar to this, though I've got a bit of a mishmash going with special cases:
template {tags} (\.[^\.,]+)(.*, )?([^,\.]*)\1; \1\2

I have to use separate search terms if the offending tags sort alphabetically before the genre tags.

Any ideas on how to search for or otherwise identify books with partially redundant tags? Maybe a calculated column? How about some cleaner, more robust replacement terms?

One other thing that bugs me is when I get info like the author's name or publisher mixed in as a tag when I've already got that information in its appropriate column.
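For illustration only, the pruning rule described above could be sketched in plain Python, outside any particular calibre feature. The function names prune_tags and has_redundant_tags and the known_values parameter are hypothetical, and the sketch assumes that "." separates hierarchy levels and that a tag is redundant when it is a strict prefix of a longer hierarchical tag, a single component of one, or a value already stored in another column (author, publisher). Near-synonyms such as "mystery & detective" or "hardboiled mystery" are not caught and would still need special-case handling.

```python
def prune_tags(tags, known_values=()):
    """Return tags with the ones implied by a longer hierarchical tag removed.

    known_values: hypothetical extra strings (e.g. author or publisher names)
    that should not be kept as tags because they live in their own column.
    """
    tags = [t.strip() for t in tags if t.strip()]
    # Every dotted component that appears in some hierarchical tag.
    components = {
        part.lower() for t in tags if "." in t for part in t.split(".")
    }
    known = {v.strip().lower() for v in known_values}
    kept = []
    for tag in tags:
        lower = tag.lower()
        # "foo.fie" is implied by "foo.fie.fum" (strict prefix at a dot boundary).
        is_prefix = any(o.lower().startswith(lower + ".") for o in tags)
        # A flat tag such as "fum" is implied when it is a component of a dotted tag.
        is_component = "." not in tag and lower in components
        if is_prefix or is_component or lower in known or tag in kept:
            continue
        kept.append(tag)
    return kept


def has_redundant_tags(tags, known_values=()):
    """True when pruning would remove at least one tag from this book."""
    cleaned = [t.strip() for t in tags if t.strip()]
    return len(prune_tags(cleaned, known_values)) < len(cleaned)


if __name__ == "__main__":
    print(prune_tags(["foo.fie", "foo.fie.fum", "foo", "fum", "fie"]))
    print(has_redundant_tags(["genre.crime", "crime"]))
```

A check like has_redundant_tags is one way to flag books that need attention rather than hunting for them case by case; how the tag lists would be read from and written back to the library is left open here.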