#1 |
Resident Curmudgeon
Posts: 79,088
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Regex and span
Let's say we have the following in a line...
<p>This is just plain text. <span class="bold">This is supposed to be bold text.</span> This is just text. <span class="italics">This is supposed to be italics</span>. This is just more text.</p>
So how would I create a regex that would select just the text inside the first or second span?
#2 |
Guru
Posts: 657
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
My regex fu must be on the blink.
This regex search should work, but it isn't working in Sigil. It does work in EditPad, though, so perhaps it's a problem with the regex engine?
Code:
(?<=<span\sclass="(bold|italics)">)(.+?)(?=</span>)
Is that any use?
EDIT: This works in Sigil (with your example). Search for
Code:
(?<=">)(.+?)(?=</)
Last edited by Perkin; 01-18-2013 at 09:28 PM.
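One plausible explanation, offered as a sketch rather than a diagnosis of Sigil itself: many regex engines only accept lookbehinds of a fixed width, and the (bold|italics) alternation gives this lookbehind two possible widths. Python's re module has the same kind of restriction, so the snippet below reproduces the failure there and shows a lookbehind-free alternative that captures the span text in a group. The variable names are illustrative only.

```python
import re

line = ('<p>This is just plain text. <span class="bold">This is supposed to be '
        'bold text.</span> This is just text. <span class="italics">This is '
        'supposed to be italics</span>. This is just more text.</p>')

# The first pattern from this post: the alternation inside the lookbehind
# makes its width variable, which Python's re module refuses to compile.
try:
    re.compile(r'(?<=<span\sclass="(bold|italics)">)(.+?)(?=</span>)')
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# Lookbehind-free alternative: match the whole span, capture only its text.
span_text = re.compile(r'<span\sclass="(?:bold|italics)">(.+?)</span>')
texts = span_text.findall(line)

print(lookbehind_ok)
print(texts)
```

In a find-and-replace dialog the captured text would be referred to as \1 in the replacement, with the surrounding tags retyped around it.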
#3 |
Addict
Posts: 379
Karma: 65460
Join Date: Jun 2011
Device: Kindle
|
Like Perkin, I'm a little befuddled as to the technical reason why the first search fails (it's a problem I've wrestled with before). It's got something to do with a conflict between the "or" symbol and the parentheses. But as I say, the nitty-gritty of the regex issue escapes me. In order to achieve the matching result intended by Perkin's original search, you need to restate each span in its entirety:
Code:
(?<=span class="italics">|span class="bold">)(.*?)(?=</span)
Code:
<span[^>]*>(.*?)</span>
Last edited by ElMiko; 01-19-2013 at 02:13 PM.
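The restated pattern fits engines that let each top-level alternative of a lookbehind have its own fixed length, which is how PCRE documents its lookbehind rule. Python's re is stricter and wants a single width for the whole lookbehind, so a rough equivalent there, shown only as an illustration, uses one lookbehind per class inside a non-capturing group. The second pattern is reproduced unchanged and captures the text of any span regardless of class.

```python
import re

line = ('<p>This is just plain text. <span class="bold">This is supposed to be '
        'bold text.</span> This is just text. <span class="italics">This is '
        'supposed to be italics</span>. This is just more text.</p>')

# One fixed-width lookbehind per class, with the alternation moved outside.
per_class = re.compile(
    r'(?:(?<=span class="italics">)|(?<=span class="bold">))(.*?)(?=</span)'
)
by_class = per_class.findall(line)

# The generic pattern: any opening span tag, captured text, closing tag.
any_span = re.compile(r'<span[^>]*>(.*?)</span>')
by_any = any_span.findall(line)

print(by_class)
print(by_any)
```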
#4 |
Resident Curmudgeon
Posts: 79,088
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
An example of what I'd like to do is to take the uppercase of fake smallcaps and convert it to lowercase. So in cases where there are multiple spans on the same line, being able to select each one is important.
<p>Just some text. <span class="SmallCaps">SOME SMALLCAP TEXT</span>. Some more text. <span class="SmallCaps">SOME MORE SMALLCAP TEXT</span>.</p>
Now I've seen stuff like this...
<p>Just some text. <span class="italic">S<span class="SmallCaps">OME</span> S<span class="SmallCaps">MALLCAP TEXT</span>.</span></p>
Can this be done?
#5 |
Well trained by Cats
Posts: 30,897
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Code:
<span class="SmallCaps">(.+?)</span>
#6 |
Guru
Posts: 657
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
Expanding on theducks' answer, to combine the spans in the second example, you could use
Code:
<span class="SmallCaps">(.+?)</span>([\sA-Z]*)<span class="SmallCaps">(.+?)</span>
Code:
<span class="SmallCaps">\L\1\E\2\L\3</span>
The trouble with using these sorts of regex is when the text contains a name or word that has an uppercase letter inside, which would then get converted to lowercase.
Last edited by Perkin; 01-20-2013 at 07:08 AM.
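Reading the second line above as the replacement string, \L lowercases what follows until \E. Python's re has no such escapes, so a sketch of the same two-span merge uses a replacement function instead. The name merge_and_lower is hypothetical, and the sample line is the nested example from post #4.

```python
import re

line = ('<p>Just some text. <span class="italic">S<span class="SmallCaps">OME'
        '</span> S<span class="SmallCaps">MALLCAP TEXT</span>.</span></p>')

# Two consecutive SmallCaps spans separated only by whitespace/capitals.
pair = re.compile(
    r'<span class="SmallCaps">(.+?)</span>([\sA-Z]*)'
    r'<span class="SmallCaps">(.+?)</span>'
)

def merge_and_lower(m):
    # Mirrors \L\1\E\2\L\3: lowercase groups 1 and 3, keep group 2 as typed.
    return ('<span class="SmallCaps">' + m.group(1).lower() + m.group(2)
            + m.group(3).lower() + '</span>')

merged = pair.sub(merge_and_lower, line)
print(merged)
```

The caveat from the post still applies: anything inside the captured groups is lowercased, including capitals that belong to names.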
#7 |
Resident Curmudgeon
Posts: 79,088
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Would any of these regex examples work if there was just one span, two spans, three spans, or even four spans? The problem is that with something like smallcaps, there isn't a set number of spans, so there's no way to know ahead of time how many spans exist in a given paragraph. Plus there could be other spans for other things in the same paragraph.
|
#8 |
Guru
Posts: 657
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
theducks' one (post #5) will only work on one span at a time; my 'combining' one (post #6) will work on two consecutive ones. You'll just need to repeat that one if there are more than two consecutive spans.
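Where a scripting language is an option, one way around counting spans is to apply the single-span pattern from post #5 with a replacement function, which runs once per match however many spans the paragraph holds. This is a sketch under that assumption, written in Python; lower_smallcaps is a hypothetical helper and the sample paragraph is made up for illustration. Spans with other classes are left untouched, and spans nested inside a SmallCaps span are not handled.

```python
import re

para = ('<p>Just some text. <span class="SmallCaps">SOME SMALLCAP TEXT</span>. '
        'Some more text. <span class="SmallCaps">SOME MORE SMALLCAP TEXT</span>. '
        '<span class="bold">KEEP THIS</span></p>')

# The single-span pattern from post #5.
smallcaps = re.compile(r'<span class="SmallCaps">(.+?)</span>')

def lower_smallcaps(html):
    # sub() calls the function for every non-overlapping match in turn.
    return smallcaps.sub(
        lambda m: '<span class="SmallCaps">' + m.group(1).lower() + '</span>',
        html,
    )

result = lower_smallcaps(para)
print(result)
```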
|