#1 |
Zealot
Posts: 146
Karma: 194
Join Date: Jun 2010
Location: Melbourne
Device: iPad
|
Regular expression search in authors
Hi there.
I am trying to find all the books in my calibre library whose author is entered as surname, then a comma, then the first name.
I have tried the following in the Advanced Search window. I select Regular Expression, then in the drop-down menu I select authors, and then in the field I type: [\w],\s[\w]. I have also tried [A-Za-Z], [A-Za-z] and hit return. But this does not work.
Can someone please tell me how to achieve this? Thank you. |
#2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
|
#3 |
Grand Sorcerer
Posts: 12,375
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Two issues:
- literal backslashes must be escaped with a backslash.
- the comma is treated strangely, because the 'real' character in the field is a '|'. This is a bug.
The following works today.
Code:
authors:"~^\\w*\\, \\w"
Code:
authors:"~^\\w*, \\w" |
#4 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
|
||
#5 | ||
Grand Sorcerer
Posts: 12,375
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
Quote:
Last edited by Starson17; 04-19-2011 at 11:15 AM. |
||
#6 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
(Note: I edited your post to add a missing slash, so I could quote it.)
Last edited by Starson17; 04-19-2011 at 11:23 AM. |
||
#7 |
Grand Sorcerer
Posts: 12,375
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
That is my expectation. Character classes that don't depend on case should work without surprise. Classes that do depend on case will match both cases, even if the class contains only one of them.
|
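A minimal sketch of the behaviour described above, using Python's `re` module. It assumes calibre's search behaves like a regex compiled with `re.IGNORECASE`; that equivalence is an assumption made for illustration, not something confirmed in this thread. The author string is made up.

```python
import re

author = "Asimov, Isaac"  # made-up sample value

# Case-independent class: \w matches letters of either case regardless of flags.
print(bool(re.search(r"^\w+, \w", author)))

# Case-dependent class: [a-z] lists only lowercase letters.
strict = re.compile(r"^[a-z]+, [a-z]")
relaxed = re.compile(r"^[a-z]+, [a-z]", re.IGNORECASE)

strict_hit = strict.search(author) is not None    # fails on the capital "A"
relaxed_hit = relaxed.search(author) is not None  # the flag lets [a-z] match "A"

print(strict_hit, relaxed_hit)
```

Under that assumption, [a-z] and [A-Za-z] would find the same authors in a search.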
#8 |
Zealot
Posts: 146
Karma: 194
Join Date: Jun 2010
Location: Melbourne
Device: iPad
|
Thanks so much!
This: authors:"~^\\w*\\, \\w" works like a charm. Thanks again.
However, I am confused. Why the two backslashes before the w? I have a reference guide that says (quote): \d, \w and \s are shorthand character classes ... can be used inside and outside character classes.
So my thinking was that I need "any letter", followed by a comma, then a space, then any letter. Hence I should be able to use \w*,\s\w*. So despite the bug about the comma, this should work, shouldn't it?
Last edited by kakkalla; 04-19-2011 at 08:31 PM. |
#9 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
If I understood the matter correctly, you need to escape the backslashes for the search parser in order for them to stay in place as a single backslash for the regex parser.
|
#10 | |
Grand Sorcerer
Posts: 12,375
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
Backslashing things is always a problem because the backslash is often used to 'escape' other special characters by each piece of the processing chain. In calibre, there are two such pieces: the search language parser and the search engine. Each will process backslashes.
The string passed to search is first processed by the search language parser. Backslashes are used as escapes: for example a \" means that the quote is part of the query and not the end of a query segment. Following fairly universal rules, all escaping backslashes are removed after processing. As such, \w is processed, determined to mean 'w', and the backslash is removed.
The result is passed to the search engine, where additional escape processing is done, for example for regular expressions.
Because of the dual processing, if you want to pass a real backslash to the search engine, you must escape it using a doubled backslash. Thus \\w instead of \w. |
|
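The two-stage processing described above can be sketched in Python. The function `strip_parser_escapes` is a hypothetical stand-in for the search language parser, not calibre's actual code: it only drops each escaping backslash and keeps the character after it. Python's `re` module stands in for the search engine, and the author string is made up.

```python
import re

def strip_parser_escapes(text: str) -> str:
    """Hypothetical first stage: remove each escaping backslash and keep
    the character that follows it literally."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Keep the escaped character; a trailing lone backslash is dropped.
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)

# What the second stage (the regex engine) receives in each case.
single = strip_parser_escapes(r"^\w*, \w")     # typed with single backslashes
doubled = strip_parser_escapes(r"^\\w*, \\w")  # typed with doubled backslashes

print(single)   # ^w*, w
print(doubled)  # ^\w*, \w

author = "Asimov, Isaac"  # made-up sample value
single_hit = re.search(single, author) is not None
doubled_hit = re.search(doubled, author) is not None
print(single_hit, doubled_hit)  # False True
```

With single backslashes the regex engine only ever sees a literal `w`, so the pattern looks for names made of the letter w. Doubling the backslash lets one survive the first stage and reach the regex engine as `\w`.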