#1 |
Fanatic
Posts: 564
Karma: 32228
Join Date: Feb 2012
Device: Onyx Boox Leaf
|
[Regex Search] Minimal match not possible?
Dear all,
When I do a regex search for something, say “(.*)”, the Editor selects everything between the first “ and the last ” in a paragraph if that paragraph contains multiple sets of “(.*)”. The selection is therefore wrong, because it includes the text between two sets of “(.*)”. If the "Dot all" box is checked, the same thing happens across the whole file rather than within a single paragraph. In Sigil, there is a "Minimal Match" option, which selects the right set. |
#2 |
creator of calibre
Posts: 45,265
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
(.*?)
|
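For anyone who wants to see the difference outside the editor, here is a minimal sketch using Python's standard `re` module. The sample strings are made up for illustration, and the assumption is that the editor's "Dot all" checkbox behaves like the `re.DOTALL` flag.

```python
import re

# Hypothetical sample paragraph with two quoted passages.
text = "He said “hello” and she answered “goodbye” quietly."

# Greedy: .* runs on to the LAST closing quote in the paragraph.
greedy = re.findall(r"“(.*)”", text)

# Non-greedy: .*? stops at the FIRST closing quote it can reach.
lazy = re.findall(r"“(.*?)”", text)

print(greedy)  # ['hello” and she answered “goodbye']
print(lazy)    # ['hello', 'goodbye']

# Two hypothetical "paragraphs" separated by a newline.
two_lines = "“one”\n“two”"

# Without DOTALL the dot does not cross the newline, so each line matches alone.
per_line = re.findall(r"“(.*)”", two_lines)

# With DOTALL the greedy dot also swallows the newline and spans both lines.
across_lines = re.findall(r"“(.*)”", two_lines, re.DOTALL)

print(per_line)      # ['one', 'two']
print(across_lines)  # ['one”\n“two']
```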
#3 |
Member
Posts: 23
Karma: 10
Join Date: Apr 2014
Location: Paris
Device: ipad 2, Ubuntu
|
Look at the documentation for the Python "re" module at https://docs.python.org/2/howto/regex.html#regex-howto
and you will read: the solution is to use the non-greedy qualifiers *?, +?, ??, or {m,n}?, which match as little text as possible. |
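A small sketch of the four qualifiers side by side, again with Python's standard `re` module. The string `"aaaa"` and the `results` dictionary are just illustrative names, not anything from the editor.

```python
import re

sample = "aaaa"

# Each greedy qualifier is followed by its non-greedy counterpart.
patterns = ["a*", "a*?", "a+", "a+?", "a?", "a??", "a{2,3}", "a{2,3}?"]

# re.match anchors at the start, so the result shows how much each form takes.
results = {p: re.match(p, sample).group() for p in patterns}

for pattern, matched in results.items():
    print(f"{pattern:10} -> {matched!r}")

# a*      takes 'aaaa', a*?      takes ''
# a+      takes 'aaaa', a+?      takes 'a'
# a?      takes 'a',    a??      takes ''
# a{2,3}  takes 'aaa',  a{2,3}?  takes 'aa'
```

On their own the non-greedy forms match as little as they are allowed to. In a larger pattern they only grow as far as needed for the rest of the pattern to match.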
#4 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
It is probably a good idea, as a general principle, to get used to doing this in the regex itself rather than relying on options set outside the regex. |
#5 |
Member
Posts: 21
Karma: 10
Join Date: Feb 2014
Device: Kobo Aura HD, Samsung Note II, Kindle and a few more
|
Additional request, @kovid or any expert at hand:
(.*?) is a bit hungry sometimes. I'd like to match anything but the tags: only characters, digits, punctuation marks and spaces but no <>. What's the trick? Thanks in advance. |
#6 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The trick is not to use a dot that matches everything. Use a regex character class instead, like
Code:
[a-zA-Z0-9.?',"]
Code:
[^<>]
There are some very interesting yet obscure applications in the corners. Like this interesting use of negative lookarounds to find matching span tags, even when nested, and delete the matching sets:
Code:
<span[^<>]*>((?:(?!<(?:/?span)).)*)</span>
Last edited by eschwartz; 12-23-2014 at 04:51 PM. |
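A rough sketch of both ideas in Python's standard `re` module, on a made-up snippet of markup. The `unwrap_spans` helper is a hypothetical name, and reading "delete the matching sets" as "replace the match with `\1`, keeping the inner text" is an assumption. Because the span pattern only matches a span that contains no other span tag, it takes the innermost pairs first, so the sketch repeats the replacement until nothing is left.

```python
import re

# Hypothetical markup with one span nested inside another.
html = '<p>one <span class="a">two <span>three</span> four</span> five</p>'

# [^<>]+ matches a run of characters that contains no angle brackets,
# so each capture is plain text sitting between two tags.
text_runs = re.findall(r">([^<>]+)<", html)
print(text_runs)  # ['one ', 'two ', 'three', ' four', ' five']

# The pattern from the post: the negative lookahead stops the dot from
# stepping onto any "<span" or "</span", so only innermost pairs match.
span_re = re.compile(r"<span[^<>]*>((?:(?!<(?:/?span)).)*)</span>")


def unwrap_spans(markup):
    """Replace span pairs with their content, innermost first, until none remain."""
    while True:
        markup, count = span_re.subn(r"\1", markup)
        if count == 0:
            return markup


first_pass = span_re.sub(r"\1", html)
print(first_pass)          # <p>one <span class="a">two three four</span> five</p>
print(unwrap_spans(html))  # <p>one two three four five</p>
```

In the editor the same effect would come from running the replace repeatedly; the loop here just automates that.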
#7 | |
Perfectionist
Posts: 72
Karma: 12802
Join Date: Apr 2014
Device: none
|
Quote:
(?<=<.*?>)(.*)(?=<.*?>) |
|
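One caveat worth checking before relying on that pattern: Python's standard `re` module rejects look-behinds of variable width such as `(?<=<.*?>)`. Whether the editor's own regex engine accepts it is not something this sketch tests. The fixed-width alternative below, built on the `[^<>]` class from earlier in the thread, is only a suggestion, and the sample markup is made up.

```python
import re

# The pattern as posted, with a variable-width look-behind.
posted = r"(?<=<.*?>)(.*)(?=<.*?>)"

try:
    re.compile(posted)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("stdlib re rejects the pattern:", exc)

# A fixed-width alternative: text that directly follows a ">" and
# directly precedes a "<", with no angle brackets inside.
between_tags = re.compile(r"(?<=>)[^<>]+(?=<)")

found = between_tags.findall("<p>one <b>two</b> three</p>")
print(found)  # ['one ', 'two', ' three']
```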
#8 |
Fanatic
Posts: 564
Karma: 32228
Join Date: Feb 2012
Device: Onyx Boox Leaf
|
Thank you, guys.
Each day I learn something. |