Old 03-03-2015, 05:06 PM   #1
phossler
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
RegEx to replace only inside certain tags?

If there is something like

<h1>Text Text <b>Bold Text</b> Text Text</h1>

<p>Text Text <b>Bold Text</b> Text Text</p>

Q: is there a way to replace (delete) the <b> and </b> only if they are inside the h1 tags?

I got this far

Find: (<[Hh][1-6]>)(.+?)(<[biu]>)(.+?)(</\3>)(.+?)(</[Hh][1-6]>)

Replace: \1\2\4\6\7

but it doesn't work because I can't figure out how to get the back reference \3 (in red) to match the tag captured by (<[biu]>) (in blue)

Q2: is there a better way?
Old 03-03-2015, 05:21 PM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
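The backreference works fine -- it just matches whatever group 3 captured, which is "(<[bui]>)", angle brackets included. So "</\3>" ends up looking for "</<b>>".

Use "<([bui])>" instead, so the group captures only the tag name, unless in fact you wish to match html which looks like ..."<b>some text</<b>>"...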
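To illustrate the difference, here is a minimal sketch in plain Python `re`, run outside the editor. It uses the pattern from post #1 with the final group written as (</[Hh][1-6]>); the variable names are made up for the example, and the editor's own regex engine may differ in details.

```python
import re

html = "<h1>Text Text <b>Bold Text</b> Text Text</h1>"

# Group 3 captures the whole tag, brackets included,
# so </\3> effectively means the literal text "</<b>>".
whole_tag = re.compile(
    r"(<[Hh][1-6]>)(.+?)(<[biu]>)(.+?)(</\3>)(.+?)(</[Hh][1-6]>)"
)

# Group 3 captures only the tag name, so </\3> means "</b>".
# The group numbers shift, hence the different replacement string.
name_only = re.compile(
    r"(<[Hh][1-6]>)(.+?)<([biu])>(.+?)</\3>(.+?)(</[Hh][1-6]>)"
)

no_match = whole_tag.search(html)            # nothing found in normal HTML
fixed = name_only.sub(r"\1\2\4\5\6", html)   # <b> and </b> dropped

print(no_match)
print(fixed)
```

With the brackets moved outside the group, the closing tag is matched by name and the replacement simply leaves the tag out.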
Old 03-03-2015, 05:47 PM   #3
theducks
Well trained by Cats
Posts: 30,889
Karma: 59840450
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Are you supposed to escape the \3?
Old 03-03-2015, 05:48 PM   #4
jhowell
Grand Sorcerer
Posts: 7,017
Karma: 90000009
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
That RE will also match the following text:

<h1>title</h1>this has <b>bold</b> text.<h2>subtitle</h2>
Old 03-03-2015, 06:27 PM   #5
eschwartz
Quote:
Originally Posted by theducks
Are you supposed to escape the \3?
No -- you escape the 3 by saying \3 and making it special...

Quote:
Originally Posted by jhowell
That RE will also match the following text:

<h1>title</h1>this has <b>bold</b> text.<h2>subtitle</h2>
True, which is why I prefer to search for rendered text with
Code:
[^<>]+
instead of
Code:
.+
But the main issue is capturing the right backreferences.
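A small sketch of that difference, again in plain Python `re` and using jhowell's sample text. Both patterns are simplified variants written for this illustration, not the exact expressions from the thread.

```python
import re

# jhowell's counterexample: the bold text sits between two headings,
# not inside either of them.
sample = "<h1>title</h1>this has <b>bold</b> text.<h2>subtitle</h2>"

# ".+?" is allowed to run across tags, so the match can start in one
# heading and end in the next one.
dot_any = re.compile(r"<[Hh][1-6]>.+?<([biu])>.+?</\1>.+?</[Hh][1-6]>")

# "[^<>]+" stops at the first angle bracket, so it cannot cross a tag.
no_brackets = re.compile(
    r"<[Hh][1-6]>[^<>]+<([biu])>[^<>]+</\1>[^<>]+</[Hh][1-6]>"
)

false_hit = dot_any.search(sample)   # matches the whole sample
safe = no_brackets.search(sample)    # no match

print(false_hit.group(0) if false_hit else None)
print(safe)
```

The trade-off is that the stricter pattern also refuses headings containing any other nested tag, since every text run must be free of angle brackets.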
Old 03-03-2015, 07:21 PM   #6
phossler
@eschwartz, theducks, jhowell

Thanks for the help!!

This is what I ended up with (so far), and it has the advantage of apparently working.

I don't yet understand the [^<>]+ part (in red), but I did incorporate your feedback and that made a big difference.

Find: <([Hh][1-6])><([biu])>([^<>]+)(</\2>)</\1>
Replace: <\1>\3</\1>
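For anyone wanting to try this pair outside the editor, here is a quick check in plain Python `re`. The test strings are invented for the example. As written, the expression appears to match only headings whose entire content is a single b/i/u element, so a heading with text around the bold part is left alone.

```python
import re

find = re.compile(r"<([Hh][1-6])><([biu])>([^<>]+)(</\2>)</\1>")
replace = r"<\1>\3</\1>"

# Heading that consists of nothing but one bold run: tags are removed.
whole = find.sub(replace, "<h1><b>Bold Title</b></h1>")

# Paragraph: not a heading, so it is left untouched.
para = find.sub(replace, "<p><b>Bold Text</b></p>")

# Heading with text around the bold run: the pattern requires the inline
# tag to follow the heading tag directly, so nothing changes here.
mixed = find.sub(replace, "<h1>Text <b>Bold</b> Text</h1>")

print(whole)
print(para)
print(mixed)
```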


So thanks again
Old 03-03-2015, 08:24 PM   #7
eschwartz
The bit in red, [^<>]+, simply tells the regex, instead of looking for "all text", look for "all text as long as it doesn't have the < or > which signify tags".

To be really advanced, you would use a negative lookaround checking for specific tags, but at that point it might be worth moving on to some engine that can actually parse html.
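One possible shape of the lookaround idea, sketched in plain Python `re` rather than as an editor search. The names HEADING, INLINE, strip_inline and clean_headings are hypothetical. It assumes heading tags without attributes and simply deletes every <b>, <i> and <u> tag found in the heading body, so it is a rough illustration rather than a real HTML parser.

```python
import re

# A heading whose body may contain anything except another heading tag
# (opening or closing); the negative lookahead is checked at each character.
HEADING = re.compile(
    r"<([Hh][1-6])>((?:(?!</?[Hh][1-6]>).)*?)</\1>", re.DOTALL
)

# Opening or closing b/i/u tags, without attributes.
INLINE = re.compile(r"</?[biu]>")


def strip_inline(match):
    """Rebuild the heading with its <b>, <i>, <u> tags removed."""
    tag, body = match.group(1), match.group(2)
    return "<{0}>{1}</{0}>".format(tag, INLINE.sub("", body))


def clean_headings(html):
    """Apply strip_inline to every heading found in html."""
    return HEADING.sub(strip_inline, html)


page = (
    "<h1>Text Text <b>Bold Text</b> Text Text</h1>\n"
    "<p>Text Text <b>Bold Text</b> Text Text</p>"
)
print(clean_headings(page))
```

Matching the heading first and cleaning its body in a second step avoids having to squeeze both jobs into one expression, and it handles more than one inline tag per heading.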