Old 08-14-2012, 07:28 AM   #1
Rob557
Connoisseur
Posts: 50
Karma: 810
Join Date: Jul 2012
Device: Kobo
Question Title bulk edit match - to remove series info from title

Would you know whether it is possible to use Calibre's bulk edit search and replace to remove the series and/or series_index from the title, specifically where the series and/or series_index is already separately identified in the appropriate fields?

For example, in Calibre's bulk edit search and replace, when editing the title field would there be a regular expression that could be used in the "search for" field to match against the contents in the series and/or series_index fields so that such character string matches could thereby be removed from the title?

The Quality Check plug-in is very helpful (check metadata / check titles with series) by identifying an overall subset of files that appear to have the series included in the title, but determining the necessary edits (in a bulk edit of further subsets of those files) is a trickier exercise. I am mainly looking at the situation of merging two libraries that used different approaches and a bulk edit approach would be more practical than editing each occurrence manually.

Thank you in advance for any help on this. I wasn't able to find any threads that established whether such an approach was possible.
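For readers wondering what matching against the series and series_index values themselves might look like, here is a minimal sketch in plain Python, run outside Calibre's bulk edit dialog. The helper name `strip_series_prefix` and the sample titles are hypothetical, and it only covers the three layouts that come up later in this thread ("series # - title", "series - # - title", "series - title").

```python
import re

def strip_series_prefix(title, series, series_index):
    """Hypothetical helper: drop a leading 'series [-] [index] -' from a title."""
    # Format the index without a trailing ".0": 3.0 -> "3", 10.5 -> "10.5".
    whole, _, frac = f"{series_index:g}".partition(".")
    # Allow leading zeros ("03") and, for whole numbers, an optional ".0".
    number = "0*" + re.escape(whole)
    number += r"\." + re.escape(frac) if frac else r"(?:\.0+)?"
    pattern = (
        r"^\s*" + re.escape(series)             # series name, taken literally
        + r"(?:(?:\s+-)?\s+" + number + r")?"   # optional " - " and index
        + r"\s+-\s+"                            # the " - " before the real title
    )
    return re.sub(pattern, "", title, count=1, flags=re.IGNORECASE)

print(strip_series_prefix("Otherworld 3 - Sample Title", "Otherworld", 3))
print(strip_series_prefix("Otherworld - 10.5 - Sample Title", "Otherworld", 10.5))
```

Using `re.escape` means punctuation in a series name is matched literally rather than being read as regex syntax. Titles that do not start with the series name are returned unchanged.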
Old 08-14-2012, 07:47 AM   #2
theducks
Grand Sorcerer
Posts: 14,846
Karma: 5654321
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
The regex has to match YOUR title string exactly.
Ex: The Title [the series 3] - The Author

Search for:
(.+)\s+\[.+\d+\]\s-\s(.+)

Replace with:
\1 - \2
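To see what this pattern does, here is a minimal sketch using Python's `re` module as a stand-in for the bulk edit dialog (Calibre is written in Python, but treat this as an approximation of its behaviour, not a copy of it). The function name `strip_bracketed_series` is made up for the illustration.

```python
import re

PATTERN = r"(.+)\s+\[.+\d+\]\s-\s(.+)"
REPLACEMENT = r"\1 - \2"

def strip_bracketed_series(title):
    # Keep group 1 (title) and group 2 (author), drop the bracketed part.
    return re.sub(PATTERN, REPLACEMENT, title)

print(strip_bracketed_series("The Title [the series 3] - The Author"))
# -> The Title - The Author

# No change here: \[.+\d+\] needs at least one character before the digits
# inside the brackets, and "[3]" has none.
print(strip_bracketed_series("Title [3] - Author"))
# -> Title [3] - Author
```

When the pattern does not match, the title is returned as-is, which is also what a test field showing no change would indicate.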
Old 08-14-2012, 08:43 AM   #3
Rob557
Connoisseur
Posts: 50
Karma: 810
Join Date: Jul 2012
Device: Kobo
Thanks theducks. I may be missing something in how to use that regular expression approach:

In the bulk edit search and replace I did the following:
search field: title
search for: (.+)\s+\[.+\d+\]\s-\s(.+)
replace with: \1 - \2
destination field: #test
your test field: Title [3] - Author
but the "test result" field came out as "Title [3] - Author" ... i.e. it did not show any changes.

To expand on my earlier comments, the following illustrates the most common examples that I am coming across for the contents of the title field:
series # - title (i.e. no brackets around the number)
series - # - title
series - title
but I'm not having much luck so far trying to modify that regular expression approach.

Last edited by Rob557; 08-14-2012 at 11:03 AM. Reason: very minor correction: test# changed to #test
Old 08-14-2012, 09:24 AM   #4
theducks
Grand Sorcerer
Posts: 14,846
Karma: 5654321
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
See, your pattern was not the most common one I have seen.
One regex does not fit all: each one only works for the EXACT format it was crafted for.
You have 3 cases (although 1 and 2 could be handled together with a more complex regex):

series # - title
.+ \d+\s-\s

series - # - title
.+\s-\s\d+\s-\s
(this may work for both this and the above:
.+\s(|-\s)\d+\s-\s
)

series - title
.+\s-\s
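The patterns above can be tried out with a short Python sketch, again using the `re` module as a stand-in for the bulk edit dialog with an empty "replace with" field. The helper name `remove_prefix` and the sample titles are hypothetical.

```python
import re

def remove_prefix(pattern, title):
    # An empty replacement deletes whatever the pattern matched.
    return re.sub(pattern, "", title)

# series # - title
print(remove_prefix(r".+ \d+\s-\s", "Sample Series 3 - Sample Title"))

# series - # - title
print(remove_prefix(r".+\s-\s\d+\s-\s", "Sample Series - 3 - Sample Title"))

# combined pattern, covering both layouts above
COMBINED = r".+\s(|-\s)\d+\s-\s"
print(remove_prefix(COMBINED, "Sample Series 3 - Sample Title"))
print(remove_prefix(COMBINED, "Sample Series - 3 - Sample Title"))

# series - title
print(remove_prefix(r".+\s-\s", "Sample Series - Sample Title"))
```

Each call prints "Sample Title". Note that `.+` is greedy, so the last pattern deletes everything up to the final " - " in the title; it should only be run on books whose real title does not itself contain " - ".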
Old 08-14-2012, 09:45 AM   #5
Rob557
Connoisseur
Posts: 50
Karma: 810
Join Date: Jul 2012
Device: Kobo
Hi theducks, I'm a real newbie on regular expressions, and I see now what you were doing with the regular expression you outlined earlier. I had thought \s was somehow making reference to the series field, but I now understand that it just matches a whitespace character, so the approach you were describing was a pragmatic structural one rather than matching against the contents of the series and series_index fields.

In that context, for the three examples that I gave, it appears the following will work if I can get subsets specific to those structures:
a) "series # - title" or "series - title"
For those two examples, just use "(.+)-" in the "search for" field
b) "series - # - title"
For that example, just use "(.+)- (.+)-" in the "search for" field.
In both cases, the "replace with" field would be blank.

That seems to work well, and I guess I was unnecessarily overcomplicating things by trying to match the contents of the series and series_index fields. I'm guessing that matching approach may not be very practical anyway.

Thanks theducks for pointing me in the right structural direction !
Old 08-14-2012, 10:57 AM   #6
theducks
Grand Sorcerer
Posts: 14,846
Karma: 5654321
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Did you win?

In the Bulk S&R:
Sometimes all you have to do is capture the part to be removed. The nice thing with this S&R form is that you get to see the results without committing a change.
Old 08-14-2012, 11:01 AM   #7
Rob557
Connoisseur
Posts: 50
Karma: 810
Join Date: Jul 2012
Device: Kobo
Thank you theducks for the additional examples that you provided. I had not seen your second set of comments when I was checking out and responding to your first comments.

I also realize now from your examples that "\s-\s" is a better delimiter to use than just the dash "-" by itself, since the dash might also be used as a hyphen within the title. Cheers.
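The difference between the bare dash and the spaced dash can be seen in a small Python sketch (Python's `re` module standing in for the bulk edit dialog; the sample title is made up):

```python
import re

title = "Sample Series 3 - Half-Moon Rising"

# Bare dash: the greedy .+ runs to the LAST "-", which is the hyphen
# inside the real title, so part of the title is deleted too.
bare_dash = re.sub(r"(.+)-", "", title)
print(bare_dash)      # -> Moon Rising

# Dash with whitespace on both sides: only the separator matches.
spaced_dash = re.sub(r".+\s-\s", "", title)
print(spaced_dash)    # -> Half-Moon Rising
```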
Old 08-08-2013, 04:37 PM   #8
Merischino
Groupie
Posts: 183
Karma: 357868
Join Date: Jul 2010
Location: somewhere south of the mason dixon line
Device: Nexus 7 FHD (aka 2013, 2nd gen), Kindle 2, Samsung Galaxy s3
Hi. I have this same need: after a major loss of hardware, I am recreating my calibre library and find myself redoing much of the cleanup I did manually the first time as I added all my books. That means re-entering series information, re-deleting series information from titles, and running lots of metadata downloads, all of which is a really inefficient use of time.

I have no experience with regular expressions, regex, and everything else mentioned in this conversation. The \s, (.+), etc. really threw me for a loop.

Is there a place where complete noobs can figure this stuff out?

For example, in the series of Otherworld books, which has ~30 titles, some in prequel mode and some chronological additions in the midst of the original series order. The series, index, title, and author are all appearing concatenated in the title field for some reason (though the files are not named that way).

With search and replace, I can figure out how to remove the English-language words but not how to remove the index items. Boolean search options don't seem to work. Is there a bulk way to fix this kind of thing? After having spent the time doing "manage series" and manual title edits on a cadre of books, dealing with title field issues is really becoming a drain.

Is there some place a body could learn how to do this stuff, given that I'm no sort of programmer and don't know what plugboards are, and half of what's written here makes no sense to me at all?

Thanks for your time,
M-

Edit: I have now read the similar post: http://www.mobileread.com/forums/sho...d.php?t=118569
And although I now have a bit more of a sense of regular expressions, I still need help figuring out how to apply that information. E.g., in a series of records whose titles are "otherworld" space - space "index" space - space "title", where the index can be either a two-digit number or a three-digit number with a decimal point, how would I bulk remove everything but the actual title, since all other information is already populated in the appropriate fields for those records?
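As a rough illustration of that layout, here is a minimal Python sketch (Python's `re` module standing in for the bulk edit dialog). The sample titles are invented, and the case-insensitive flag and the names `OTHERWORLD_PREFIX` and `strip_otherworld_prefix` are choices made for this sketch only.

```python
import re

# "otherworld", " - ", an index such as 03 or 10.5, " - ", then the title.
OTHERWORLD_PREFIX = re.compile(
    r"^otherworld\s-\s\d+(?:\.\d+)?\s-\s",
    re.IGNORECASE,
)

def strip_otherworld_prefix(title):
    # Delete the matched prefix; titles that do not match are left alone.
    return OTHERWORLD_PREFIX.sub("", title)

print(strip_otherworld_prefix("Otherworld - 03 - Sample Title"))
print(strip_otherworld_prefix("Otherworld - 10.5 - Another Sample"))
print(strip_otherworld_prefix("Unrelated - 1 - Book"))
```

Here `\d+` matches the whole-number part of the index and `(?:\.\d+)?` optionally matches a decimal point followed by more digits, so both index styles are covered by one pattern.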

Last edited by Merischino; 08-08-2013 at 04:53 PM. Reason: appending reading history on other relevant posts
Old 08-08-2013, 06:18 PM   #9
Adoby
Handy Elephant
Posts: 1,105
Karma: 5168844
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Ubuntu Linux, Cybook Opus, Motorola Xoom with Mantano Premium
You can match three groups of text, each separated by " - ". That way the actual contents of the groups can be ignored. Just match everything. A little dangerous because it can match titles you don't want to match. Be careful. Make sure you have a backup and know how to restore it.

^ makes sure the match starts at the first character. Not really needed here, but makes it a little safer to avoid surprises.
.+ means match any sequence of chars, 1 long or longer.

So the pattern could be:
^(.+)\s-\s(.+)\s-\s(.+)

Replace with just the third group:
\3

Actually this would also work:
^(.+)\s-\s(.+)\s-\s

Replace with nothing. The parts that match are deleted.

Or this:
^Otherworld\s-\s.+\s-\s

Replace with nothing. A little safer, will only work on books with a title that starts with "Otherworld". Groups are not needed since you don't intend to reference any.

So the pattern could also be:
^.+\s-\s.+\s-\s(.+)

Replace with just the only group:
\1
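The generic variants above behave the same way on a well-formed title, as this short Python sketch suggests (Python's `re` module standing in for the bulk edit dialog; the sample titles are made up):

```python
import re

title = "Otherworld - 10.5 - Sample Title"

# Three groups, keep only the third.
keep_third = re.sub(r"^(.+)\s-\s(.+)\s-\s(.+)", r"\3", title)

# Match only the prefix and replace it with nothing.
delete_prefix = re.sub(r"^(.+)\s-\s(.+)\s-\s", "", title)

# One group for the part to keep.
single_group = re.sub(r"^.+\s-\s.+\s-\s(.+)", r"\1", title)

print(keep_third, delete_prefix, single_group, sep="\n")

# Why this is "a little dangerous": any title with two " - " separators
# matches, whether or not the first two parts are series information.
surprise = re.sub(r"^(.+)\s-\s(.+)\s-\s(.+)", r"\3",
                  "Standalone Book - A Novel - Revised Edition")
print(surprise)   # -> Revised Edition
```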
Old 08-08-2013, 07:57 PM   #10
Rob557
Connoisseur
Posts: 50
Karma: 810
Join Date: Jul 2012
Device: Kobo
Adoby's posting has prompted me to belatedly note a different "regular expressions" approach that I've used to extract the series number from the title field, when there is not a standard format for how the series number appears in the title field:

a) apply this to only the problem books and confirm there are no other numeric values in the title field other than the series number,

b) set "title" as the search field and "series_index" (or a temp work field) as the destination field, and use the regular expression [ a-zA-Z ( ) \[ \] \_ \- \& ,;:'"!&] which should hopefully get rid of everything except the numerical series number ... if the bulk edit won't run it means there's another sneaky character in there somewhere and your destination field only accepts numerical characters.

c) set "title" as the search field and a temp work field as the destination field (to be copied into the title field if all goes well) and use the regular expression [0-9] to get rid of the numerical series numbers (include \- if a dash accompanies the series numbers and does not otherwise appear in the various titles) ... a regular boolean search and replace can be used to get rid of extra spaces, although an extra space at the end or beginning of a field can be another complication.

Anyway, just another pragmatic approach.
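A rough Python sketch of steps b) and c), with Python's `re` module standing in for the bulk edit dialog. The character class is a tidied version of the one in step b) (extra spaces and the repeated & removed), and the sample title is invented.

```python
import re

title = "Otherworld - 03 - Sample Title"

# Step b): delete letters, spaces and punctuation, leaving the number.
NON_NUMERIC = r"""[ a-zA-Z()\[\]_\-&,;:'"!]"""
series_index = re.sub(NON_NUMERIC, "", title)
print(series_index)      # -> 03

# Step c): delete the digits (and here the dashes), then tidy up the
# leftover runs of spaces.
without_number = re.sub(r"[0-9\-]", "", title)
cleaned_title = re.sub(r"\s+", " ", without_number).strip()
print(cleaned_title)     # -> Otherworld Sample Title
```

As step a) warns, this only gives a sensible result when the series number is the only numeric value in the title. The series name itself is still present afterwards and has to be removed separately.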

Also a related reminder note about chaley's tip on how to reset a numeric field to be a blank field (although unrelated to bulk edits and regular expressions):
http://www.mobileread.com/forums/sho...d.php?t=186220
Old 08-10-2013, 03:11 PM   #11
sequoia1
Junior Member
Posts: 5
Karma: 10
Join Date: Aug 2010
Device: iphone, ebookwise, reb1200
For those who wonder why ANYONE would put the series in the title as well as in the appropriate series field: I redid my whole library to include it because I could never get the series to show up on any of our Apple devices and had to manually rename within the app. I know those who do not use Apple devices do not seem to have this issue ... wish there was a way around it. Since I am only a "user" of calibre with no knowledge or understanding of programming, the rest of this thread was beyond my comprehension. It took me over a year to get the name output the way I wanted it to start with, even with many hours of reading the manual and forum responses.
Old 08-10-2013, 03:49 PM   #12
Adoby
Handy Elephant
Posts: 1,105
Karma: 5168844
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Ubuntu Linux, Cybook Opus, Motorola Xoom with Mantano Premium
It is usually better to keep the title as it is while the books are stored in calibre. Instead use the plugboard and save to device/disk template features to add the series information to the titles on the fly when the books are sent to the device. That way you won't have to remove the series information from the titles in calibre, and the get metadata feature works as it should.
Old 08-10-2013, 03:58 PM   #13
HarryT
eBook Enthusiast
Posts: 64,009
Karma: 42472847
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Kobo H2O, N7
Wouldn't it be easier simply to restore the backup you've (presumably) made of your Calibre library? This is why we make backups.
Old 08-11-2013, 10:01 AM   #14
LadyKate
Groupie
Posts: 175
Karma: 673232
Join Date: Jul 2013
Location: Quebec CA
Device: android 4 (samsung tablet and asus tablet)
I think I'm a tad OCD about the way I keep my library.

After many false starts I made a good start at paring down my library and then did a major no-no with renaming en masse (never make mass updates when overtired).

I went back to my backup but it was a couple of weeks old and had many many changes to the metadata that I needed to redo.

This made me start writing my metadata changes to the epubs using that oh-so-useful plugin, Modify ePub.

This ensures that my metadata is available for every single book whether I copy it to a device or to another library.

<Note: I did say I was OCD about my library. I've been collecting since the mid 90s and have way too many books to organize, as I have just started using Calibre to clean up way too many gigs of books>
Old 08-11-2013, 06:03 PM   #15
Merischino
Groupie
Posts: 183
Karma: 357868
Join Date: Jul 2010
Location: somewhere south of the mason dixon line
Device: Nexus 7 FHD (aka 2013, 2nd gen), Kindle 2, Samsung Galaxy s3
Thank you Adoby and Rob,
I think there's a lot of power here, but I just can't seem to wrap my head around it. I did try both sets of regular expressions but was not able to achieve results worthy of actual application (based on the in-dialog "test" results section).

For the otherworld example I went ahead and made the changes manually. There are still quite a number of series where I have this title/series issue and I will want to use the regular expression method, but I think I need to actually spend a day or so learning how to write my own, rather than trying to apply something given to me, since without the understanding behind it I can really screw everything up royally.

Harry T: thank you for reminding me to backup my calibre library.
LadyKate: I'm like you. I've just discovered the "modify epub" and "polish" plugins and they really are so much better than doing the whole "save cover and metadata in a separate file" thing I've been doing for the last umpteen years. I'm still trying to figure out which method I prefer, and which of the many sets of settings are the most useful for my purposes. But so far, it looks like embedding all my carefully applied downloaded/corrected metadata (authors, series, comments, tags, improved covers, etc.) is the way to go. It doesn't make sense to continue saving these to external files, since when re-importing my own files back into calibre I have to do all that work all over again. (I never could figure out why I had to do things two and three times....)

yadda yadda I love this program. OT brain fart: Still wondering why it has never actually achieved a "full release" versioning aka 1.0? It's mature, well-loved, and constantly improved. Surely we can consider it out of alpha?