Thread started in Saved Search/Regex Functions by Alinara and continued here: it seems to me that a new, dedicated thread is a better place, since the other thread is for general solutions that may apply to more people.
Quote:
Originally Posted by Alinara
I want to delete the first author in different books. The author's name differs from book to book, so I need to select the first value in the author list. But when I try a regex search, it always applies the expression to every value in the multi-value author column.
Author a b ::: c d
search (\w+)\s(\w+)
replace \2
result b ::: d
wished c d
Can someone help me?
First of all, are you talking about editing metadata in the calibre library, or about modifying a single book in the calibre editor?
In the first case, you should post in Library Management.
In the second case, did I understand your problem correctly? You have (in an html file of your epub) the text: <p>John Doe ::: Janet Doodle</p>
or: <p>John Albert Doe ::: Janet Eve Doodle</p>
and you want as a result:
<p>Janet Doodle</p> (or <p>Janet Eve Doodle</p>)
Is that what you meant?
In that case, this works:
Code:
search: (?:\w+\s*)+::+\s*((?:\w+\s)+(?:\w*))
replace: \1
Remark 1: (?:exp) is a non-capturing group. The 1st capturing group (\1) is here the whole group ((?:\w+\s)+(?:\w*)), which is itself made of 2 non-capturing groups.
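To try the search/replace pair outside the editor, here is a minimal sketch in Python. It assumes the standard `re` module behaves like the editor's regex engine for this pattern; the function name `keep_second_author` is made up for the illustration. The pattern and the replacement are the ones given above, unchanged.

```python
import re

# Search pattern from the post, unchanged.
# (?:...) groups do not capture, so \1 is the second author's name.
PATTERN = re.compile(r"(?:\w+\s*)+::+\s*((?:\w+\s)+(?:\w*))")


def keep_second_author(html: str) -> str:
    """Replace 'First Author ::: Second Author' with 'Second Author'.

    Hypothetical helper name, used only for this illustration.
    """
    return PATTERN.sub(r"\1", html)


if __name__ == "__main__":
    print(keep_second_author("<p>John Doe ::: Janet Doodle</p>"))
    # <p>Janet Doodle</p>
    print(keep_second_author("<p>John Albert Doe ::: Janet Eve Doodle</p>"))
    # <p>Janet Eve Doodle</p>
```

Note that the capturing group needs at least one word followed by whitespace, so a second author made of a single word is left untouched by this pattern.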
Remark 2: This is the general idea. You will probably want to adapt the regex, for example:
- to capitalize the first letter of each word
- to match expressions only outside an html tag, as in:
>(?:\w+\s*)+::+\s*((?:\w+\s)+(?:\w*))<
(the replacement then has to be >\1< so that the > and < are kept)
- to match John A. Doe (the current regex does not handle it correctly, because \w does not match the dot)
- The 2nd author must be followed by an html tag or a non-letter character (i.e. punctuation). If not, you'll have to work out how to stop the capturing group.
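Two of the adaptations listed in Remark 2 (initials such as "A." and matching only between html tags) could be combined roughly as follows. This is only a sketch, not the regex from the post: the building blocks `WORD` and `NAME` and the function name `keep_second_author_in_tag` are assumptions made for the example, and it is written for Python's standard `re` module.

```python
import re

# Hypothetical building blocks, not taken from the original post.
WORD = r"\w+\.?"                   # a word, optionally followed by a dot ("A.")
NAME = rf"{WORD}(?:\s+{WORD})*"    # one or more words separated by whitespace

# The author pair must sit directly between ">" and "<",
# which also tells the capturing group where to stop.
PATTERN = re.compile(rf">\s*{NAME}\s*::+\s*({NAME})\s*<")


def keep_second_author_in_tag(html: str) -> str:
    """Keep only the second author, putting back the '>' and '<' that were matched."""
    return PATTERN.sub(r">\1<", html)


if __name__ == "__main__":
    print(keep_second_author_in_tag("<p>John A. Doe ::: Janet E. Doodle</p>"))
    # <p>Janet E. Doodle</p>
```

Because the words of a name must be separated by at least one whitespace character, this form also avoids the nested quantifier of (?:\w+\s*)+, which can backtrack heavily on long words.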