Old 09-27-2011, 02:43 PM   #1
Aldebaranian
Metadata Search & Replace - when it doesn't match

Hi all,

I've just started using calibre and am very happy with it so far! However, I am trying to tidy up my import of a bunch of ebooks acquired over some time and named rather chaotically. I am trying to do this using the Bulk Metadata function.

I have a fair few files and, after importing, some of them have been given author names of the form:

Case 1 <author> - <series> - <number>

while others might have

Case 2 <author> - <series>

Now, I wanted to avoid doing things by hand so I planned to use regex replacement to extract the series with an expression of the kind:

Author - ([^-]+) - ([0-9]+)

and then put \1 into series.

That works well for Case 1 above, but not for Case 2: there the regex does not match, and the entire author string ends up in the series field.

Is there a way to tell calibre that, when a regex search and replace _doesn't_ match, nothing should be written to the metadata? That seems a reasonable way to operate to me, but it doesn't seem to be how it works, so I'm probably doing something silly.

Thx!
Old 09-27-2011, 04:09 PM   #2
Manichean
You might want to make the second group optional, as in
Code:
Author - ([^-]+) ?-? ?([0-9]+)?
Old 09-27-2011, 05:15 PM   #3
Aldebaranian
Hi,

Yes, the optional grouping works fine in some situations, and by tweaking the regexp I can make it work for most titles. However, the problem is that I have quite a few books and I'd prefer not to go through them one by one (or in groups) to make sure the regexp works for them. I'd rather have a solution where I can write a generic regexp that silently skips any books that do not match.

Last edited by Aldebaranian; 09-27-2011 at 06:33 PM.
Old 09-28-2011, 03:21 AM   #4
chaley
Search and replace does nothing to the contents of a field if the search matches nothing, instead passing the field through unchanged. This is consistent with the definition: search for X, replace X with Y, then write the results back to the field. If X is not found, then the field is passed through without modification to be made available to the post-search functions. That is why in your case the entire string is being passed to series. I am not willing to change this behavior. There is too much risk of seriously breaking something.
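The pass-through behaviour described above can be reproduced with an ordinary regex substitution. The following is a minimal sketch in Python, not calibre's own code: the function name `search_replace` and the sample author strings are made up for illustration, and Python's `re.sub` is used only because it happens to behave the same way when nothing matches.

```python
import re

# Same shape as the pattern discussed in this thread:
# <author> - <series> - <number>
PATTERN = re.compile(r"(.*) - ([^-]+) - ([0-9]+)")


def search_replace(value: str, replacement: str) -> str:
    """Plain substitution: a value that does not match comes back unchanged."""
    return PATTERN.sub(replacement, value)


# Hypothetical author strings for the two cases.
case1 = "Jane Roe - Some Series - 3"
case2 = "Jane Roe - Some Series"

# Case 1 matches, so only the series group survives.
print(search_replace(case1, r"\2"))
# Case 2 does not match, so the whole string is passed through
# and would be written to the destination field as-is.
print(search_replace(case2, r"\2"))
```

The second call is the situation from the first post: nothing was replaced, so the unmodified author string is what gets copied into series.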

What you should do is ensure that you are operating on books that will match the pattern you are using. Yes, I recognize that this is what you say you don't want to do, but it isn't as hard as you might be thinking. You first search your library for the books that match the pattern, then you select all the results, then do the search/replace.

For example, first search (using the search bar) for
Code:
author:"~(.*) - ([^-]+) - ([0-9]+)"
This will find all the books that match the pattern, in this case books with an author containing two dashes surrounded by spaces. You then do the search/replace on the results, using the same pattern. In this case, \1 captures the author, \2 captures the series, and \3 captures the series index. You would do three search/replaces to fix the three fields, probably in the destination order series, series_index, author.
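As a rough illustration of what the three capture groups hold, here is a small Python sketch. It is an assumption-laden stand-in, not calibre's implementation: the helper `split_author_field` and the sample strings are hypothetical, and it uses a full-string match, which is stricter than the "contains" match done by the search bar.

```python
import re
from typing import Optional, Tuple

# The pattern from the search above, without the author:"~...". wrapper.
PATTERN = re.compile(r"(.*) - ([^-]+) - ([0-9]+)")


def split_author_field(value: str) -> Optional[Tuple[str, str, int]]:
    """Return (author, series, series_index), or None if the value does not match."""
    m = PATTERN.fullmatch(value)
    if m is None:
        return None  # corresponds to a book the search would not select
    author, series, index = m.groups()  # \1, \2, \3
    return author, series, int(index)


print(split_author_field("Jane Roe - Some Series - 3"))
print(split_author_field("Jane Roe - Some Series"))
```

Filtering first (here, the `None` branch) is what makes the later replacements safe: only values known to match are ever rewritten.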

One caveat: Author is a multiple field. If you have a book of the form
Joe Blogs - Whazzup - 1 & John Doe, things get rather complicated.
Old 09-28-2011, 11:35 AM   #5
Aldebaranian
Thanks chaley,

That's clear - I can certainly understand that changing the default behaviour is unwanted. It would have been nice to have a toggle to change the behaviour, but it is certainly not crucial - I was just wondering whether I was missing something.

I guess the simplest for me would be to do this with a separate script and calibredb instead - seems fairly easy (and more fun than clicking in the GUI).

Thanks!
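A script along the lines mentioned in this post might look roughly like the sketch below. It is only an outline under stated assumptions: the book ids and author strings are invented sample data, the helper `build_command` is hypothetical, and the `calibredb set_metadata ... --field name:value` form should be checked against the calibredb documentation for the installed calibre version before running anything. The sketch only builds and prints the commands; it does not execute them.

```python
import re
import shlex
from typing import List, Optional

PATTERN = re.compile(r"(.*) - ([^-]+) - ([0-9]+)")


def build_command(book_id: int, authors: str) -> Optional[List[str]]:
    """Build a calibredb call for one book, or None if the pattern does not match."""
    m = PATTERN.fullmatch(authors)
    if m is None:
        return None  # silently skip books that do not match
    author, series, index = m.groups()
    return [
        "calibredb", "set_metadata", str(book_id),
        "--field", f"series:{series}",
        "--field", f"series_index:{index}",
        "--field", f"authors:{author}",
    ]


# Invented sample data: book id -> current author string.
books = {
    12: "Jane Roe - Some Series - 3",
    13: "Jane Roe - Some Series",
}

for book_id, authors in books.items():
    cmd = build_command(book_id, authors)
    if cmd is not None:
        # Print instead of running; pass cmd to subprocess.run once verified.
        print(shlex.join(cmd))
```

Printing the commands first gives a dry run that can be reviewed before any metadata is changed, and non-matching books are left untouched, which is the behaviour asked for at the start of the thread.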
Tags
calibre, metadata, regular expressions

