#1
Connoisseur
Posts: 75
Karma: 10
Join Date: Sep 2016
Device: Kindle
Regex: grabbing <h3><span> tag group
Need help with my regex. I search for <h3.*>(\w.+)</.*></h3> and replace it with <h3>\1</h3>
It was working and caught several of the sections I'm after, but it stopped working on the rest. An example it failed on is the blurb below, which looks identical to the first few (to me). I'm after the content "3. Escape". Complete blurb I'm looking at:
Quote:
Last edited by meghane_e; 03-16-2019 at 04:22 PM. Reason: format
#2
Well trained by Cats
Posts: 30,005
Karma: 57259778
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Why not use Diap's Toolbag to delete spans with certain attributes (or naked ones) first, to make things simpler?
#3
Not Quite Dead
Posts: 194
Karma: 654170
Join Date: Jul 2015
Device: Paperwhite 4; Galaxy Tab
The following regex seems to work fine with the code snippet you gave:
Quote:
Last edited by Brett Merkey; 03-16-2019 at 06:29 PM.
#4
Connoisseur
Quote:
Maybe I'm having a more basic problem of which workflow to use here. I'm flip-flopping between whether the Editor is the better way to edit it, or whether I should write a script/program to transform the HTMLZ first. Is there a thread that discusses which workflows work better in various conditions?
#5
Well trained by Cats
Yes, it matches the tag pair for the conditions supplied. It can be a bit tedious, but IMHO it is safer to do one condition at a time rather than use a wildcard (that will allow foot shooting):
span
style calibre\d+ REGEX mode
#6
Connoisseur
Hmm, thank you for the info. Not sure if I can use it right now, but I'm sure I'll need it later.
#7
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2014
Device: Android Cool Reader
A great site for testing Regex is regex101.com.
Try this:
(<h3.*>)(\w[^<]+)(<.+>)(</h3>)
The section [^<]+ tells it to take anything that is NOT a <, whereas a . will keep going and take anything until it gets to the </tag> before the </h3>.
Of course, you can do what you want with the grouping. I just use it on that site to better see if each section is doing what I want.
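The difference between the greedy . and [^<]+ can be checked outside the Editor as well. Below is a minimal Python sketch, assuming a shortened, single-line heading with hypothetical class names (it is not the poster's full markup) and Python's re module rather than the Editor's regex engine:

```python
import re

# Hypothetical, shortened single-line snippet (not the poster's full markup).
html = '<h3 class="c5"><span class="bold"><span class="underline">3. Escape</span></span></h3>'

# Original search from post #1: the greedy ".+" runs past the first closing tag.
greedy = re.search(r'<h3.*>(\w.+)</.*></h3>', html)

# Search suggested above: "[^<]+" stops at the first "<" after the text.
negated = re.search(r'(<h3.*>)(\w[^<]+)(<.+>)(</h3>)', html)

print(greedy.group(1))   # 3. Escape</span>
print(negated.group(2))  # 3. Escape

# Rebuild the heading from the captured text only.
cleaned = re.sub(r'(<h3.*>)(\w[^<]+)(<.+>)(</h3>)', r'<h3>\2</h3>', html)
print(cleaned)           # <h3>3. Escape</h3>
```

With only one span inside the heading both searches capture the same text, which may be why the first pattern appeared to work on some headings and not on others.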
#8
Connoisseur
regex101.com looks like the most helpful test site I've come across! Good layout and usability! Thanks Joe! Still working my problem out though.
#9
Connoisseur
Thanks again everyone! Well, this gets me the content I want, at least on regex101.com and in the Editor:
Given code:
Code:
<h3 class="calibre_5"><span class="calibre3"><span class="bold"> <a href="http://www.noname.org/forums/story/david-mcleod/thetranslator/3" class="calibre2"><span class="calibre_1"> <span class="underline">3. Escape</span></span></a> </span></span></h3>
Search for:
(<h3[^>]+>(<[^>]+>)+)([^<]+)(([^>]+>)+(.*<\/h3>))
Replace with \3
But that seems overly complicated?
Edit: Yay! (<h3.*>)(\w[^<]+)(<.+>)(<\/h3>) does work. There was a hidden newline in the generated code. Once I found it and took it out, it was happy.
Last edited by meghane_e; 03-28-2019 at 04:38 PM.
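The hidden-newline effect can be reproduced in a few lines. This is a rough Python sketch, assuming a shortened heading with hypothetical class names and a newline placed between two tags; it uses Python's re module, so the Editor's own engine and options may behave differently:

```python
import re

pattern = r'(<h3.*>)(\w[^<]+)(<.+>)(</h3>)'

# Hypothetical snippet with a newline hidden between two tags.
html = '<h3 class="c5"><span class="bold">\n<span class="underline">3. Escape</span></span></h3>'

# By default "." does not match a newline, so "<h3.*>" cannot reach the text.
no_match = re.search(pattern, html)
print(no_match)  # None

# Option 1: remove the newline first, as described in the edit above.
one_line = html.replace('\n', '')
fixed = re.search(pattern, one_line)
print(fixed.group(2))  # 3. Escape

# Option 2: let "." match newlines as well.
dotall = re.search(pattern, html, flags=re.DOTALL)
print(dotall.group(2))  # 3. Escape
```

In Python the same effect as re.DOTALL is available inline by starting the pattern with (?s).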