#1 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
RegEx to replace only inside certain tags?
If there is something like
<h1>Text Text <b>Bold Text</b> Text Text</h1>
<p>Text Text <b>Bold Text</b> Text Text</p>

Q: Is there a way to replace (delete) the <b> and </b> only if they are inside the h1 tags?

I got this far:

Find: (<[Hh][1-6]>)(.+?)(<[biu]>)(.+?)(</\3>)(.+?)<(</[Hh][1-6]>)
Replace: \1\2\4\6\7

but it doesn't work because I can't figure out how to get the red back reference to match the blue tag.

Q2: Is there a better way?
#2 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The backreference works fine -- it just matches "(<[bui]>)"
Use "<([bui])>" instead, unless in fact you wish to match html which looks like ..."<b>some text</<b>>"... |
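A minimal sketch of the difference, using Python's `re` module as a stand-in (an assumption; the editor's own regex engine may differ in other details, but backreferences behave the same way here). The variable names are illustrative only.

```python
import re

text = "<h1>Text Text <b>Bold Text</b> Text Text</h1>"

# The group captures the whole tag "<b>", so "</\1>" looks for "</<b>>".
whole_tag = re.compile(r"(<[biu]>)(.+?)(</\1>)")
# The group captures only the tag name "b", so "</\1>" looks for "</b>".
name_only = re.compile(r"<([biu])>(.+?)</\1>")

whole_tag_match = whole_tag.search(text)
name_only_match = name_only.search(text)
odd_match = whole_tag.search("<b>some text</<b>>")

print(whole_tag_match)           # None
print(name_only_match.group(0))  # <b>Bold Text</b>
print(odd_match.group(0))        # <b>some text</<b>>
```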
#3 |
Well trained by Cats
Posts: 30,889
Karma: 59840450
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Are you supposed to escape the \3?
#4 |
Grand Sorcerer
Posts: 7,017
Karma: 90000009
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
That RE will also match the following text:
<h1>title</h1>this has <b>bold</b> text.<h2>subtitle</h2> |
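The over-match can be reproduced with a short sketch in Python's `re` module (an assumption; it stands in for the editor's engine). The pattern below is a hypothetical repaired form of the Find expression from the first post: the tag name is captured instead of the whole tag, and the stray `<` before the last group is dropped, so the group numbers differ from the original.

```python
import re

# Hypothetical repaired pattern; it still uses the lazy ".+?" between tags.
pattern = re.compile(
    r"(<[Hh][1-6]>)(.+?)<([biu])>(.+?)</\3>(.+?)(</[Hh][1-6]>)"
)

sample = "<h1>title</h1>this has <b>bold</b> text.<h2>subtitle</h2>"
match = pattern.search(sample)

# The match starts at <h1> and only ends at </h2>.
print(match.group(0) == sample)  # True
print(repr(match.group(2)))      # 'title</h1>this has '

# The replacement therefore strips bold that sits outside any heading.
result = pattern.sub(r"\1\2\4\5\6", sample)
print(result)  # <h1>title</h1>this has bold text.<h2>subtitle</h2>
```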
#5 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
No -- you escape the 3 by saying \3 and making it special...
To keep the match from running across other tags, use
Code:
[^<>]+
instead of
Code:
.+
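A small sketch of how the two sub-patterns differ, again assuming Python's `re` module as a stand-in for the editor's engine. The sample string is made up for illustration.

```python
import re

sample = "<b>one</b> and <b>two</b>"

# ".+" accepts any character, including "<" and ">", so it runs to the last </b>.
greedy = re.search(r"<b>(.+)</b>", sample)
# "[^<>]+" stops at the first "<" or ">", so it cannot cross into another tag.
no_tags = re.search(r"<b>([^<>]+)</b>", sample)

print(greedy.group(1))   # one</b> and <b>two
print(no_tags.group(1))  # one
```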
|
#6 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@eschwartz, theducks, jhowell
Thanks for the help!! This is what I ended up with (so far), and it has the advantage of apparently working. I don't understand the part in red yet, but I did incorporate your feedback and that made a big difference.

Find: <([Hh][1-6])><([biu])>([^<>]+)(</\2>)</\1>
Replace: <\1>\3</\1>

So thanks again.
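For anyone trying this Find/Replace pair, here is a quick check in Python's `re` module (an assumption; the editor's engine may differ). The sample strings are illustrative. As written, the pattern only fires when the b/i/u tag wraps the entire heading text, so a heading with text around the bold span is left alone.

```python
import re

find = re.compile(r"<([Hh][1-6])><([biu])>([^<>]+)(</\2>)</\1>")
replace = r"<\1>\3</\1>"

wrapped = "<h2><i>Chapter One</i></h2>"
mixed = "<h1>Text Text <b>Bold Text</b> Text Text</h1>"
paragraph = "<p><b>Bold Text</b></p>"

wrapped_out = find.sub(replace, wrapped)
mixed_out = find.sub(replace, mixed)
paragraph_out = find.sub(replace, paragraph)

print(wrapped_out)    # <h2>Chapter One</h2>
print(mixed_out)      # unchanged: text surrounds the <b> span
print(paragraph_out)  # unchanged: not a heading
```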
#7 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The bit in red simply tells the regex, instead of looking for "all text", look for "all text as long as it doesn't have the < or > which signify tags".
To be really advanced, you would use a negative lookaround checking for specific tags, but at that point it might be worth moving on to some engine that can actually parse HTML.
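As one possible illustration of the "engine that can actually parse HTML" route, here is a rough sketch built on `html.parser` from the Python standard library. It is not something proposed in the thread; `HeadingCleaner` and `strip_inline_in_headings` are hypothetical names, and the sketch assumes reasonably well-formed markup. It does not re-emit comments or doctype declarations, and it writes end tags in lowercase.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
INLINE = {"b", "i", "u"}


class HeadingCleaner(HTMLParser):
    """Re-emit the input, dropping <b>/<i>/<u> tags found inside headings."""

    def __init__(self):
        # convert_charrefs=False leaves entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.heading_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.heading_depth += 1
        if self.heading_depth and tag in INLINE:
            return  # drop the formatting tag but keep the text inside it
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.heading_depth and tag in INLINE:
            return
        if tag in HEADINGS and self.heading_depth:
            self.heading_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_inline_in_headings(html):
    cleaner = HeadingCleaner()
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)


sample = (
    "<h1>Text Text <b>Bold Text</b> Text Text</h1> "
    "<p>Text Text <b>Bold Text</b> Text Text</p>"
)
print(strip_inline_in_headings(sample))
```

Because the parser tracks whether it is currently inside a heading, bold text in the paragraph is left as it is, and text between two separate headings is never treated as part of one.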