Old 01-18-2013, 08:43 PM   #1
JSWolf
Resident Curmudgeon
Posts: 73,938
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Regex and span

Let's say we have the following in a line...

<p>This is just plain text. <span class="bold">This is supposed to be bold text.</span> This is just text. <span class="italics">This is supposed to be italics</span>. This is just more text.</p>

So how would I create a regex that selects just the text inside the first or second span?
Old 01-18-2013, 09:22 PM   #2
Perkin
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
My regex-fu must be on the blink.
This search should work, and it does in EditPad, but it's not working in Sigil, so perhaps it's a problem with the regex engine?

Code:
(?<=<span\sclass="(bold|italics)">)(.+?)(?=</span>)
The group that stores the text will be \2, not the usual \1, which here holds the class name (bold or italics).

Is that any use?

EDIT: This works in Sigil (with your example).
Search for
Code:
(?<=">)(.+?)(?=</)
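
A likely explanation for the first search failing: Sigil's regex engine is PCRE-based, and PCRE's documentation (at least for the versions current at the time) says each top-level branch of a lookbehind must match a fixed length. Here the single branch contains (bold|italics), which can be 4 or 7 characters long. The sketch below is not Sigil itself; it uses Python's re module, which has a similar (slightly stricter) rule, only to show both searches side by side. The variable names are made up for the illustration.

```python
import re

line = ('<p>This is just plain text. <span class="bold">This is supposed to be '
        'bold text.</span> This is just text. <span class="italics">This is '
        'supposed to be italics</span>. This is just more text.</p>')

# First search: the lookbehind can be two different lengths ("bold" vs
# "italics"), so Python's re refuses to compile it.
try:
    re.compile(r'(?<=<span\sclass="(bold|italics)">)(.+?)(?=</span>)')
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# Search from the EDIT: the lookbehind is always exactly two characters.
texts = re.findall(r'(?<=">)(.+?)(?=</)', line)

print(lookbehind_ok)
print(texts)
```

Note that the short search matches after any `">`, so it will also pick up text following other tags that end an attribute that way, not only spans.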

Last edited by Perkin; 01-18-2013 at 09:28 PM.
Old 01-19-2013, 02:06 PM   #3
ElMiko
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
Like Perkin, I'm a little befuddled as to the technical reason why the first search fails (it's a problem I've wrestled with before). It has something to do with a conflict between the "or" symbol and the parentheses, but as I say, the nitty-gritty of the regex issue escapes me. To achieve the match intended by Perkin's original search, you need to restate each span in its entirety:
Code:
(?<=span class="italics">|span class="bold">)(.*?)(?=</span)
@JSWolf, all that being said, your question is unclear. Are you looking for a search that will match only one of the spans, or both spans? If it's the latter, the above regex will do the trick. If it's the former, it will match too much, and you have to limit the look-behind to the desired span class. Also, are you looking to match only "bold" and "italics" spans, or all spans? If the latter, then you may want to use:
Code:
<span[^>]*>(.*?)</span>
Note that this search will (unlike the lookbehind searches) match the span tags themselves as well as the text they enclose; the text on its own is in group \1.
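
To make the "text is in the group" point concrete, here is a minimal sketch in Python's re module (not Sigil; the helper name span_texts is made up for the illustration). It avoids lookbehind altogether by matching the whole span and reading the capture groups, and it assumes the spans are not nested.

```python
import re

line = ('<p>This is just plain text. <span class="bold">This is supposed to be '
        'bold text.</span> This is just text. <span class="italics">This is '
        'supposed to be italics</span>. This is just more text.</p>')

# Any span, whatever its attributes; group 1 is the enclosed text.
ANY_SPAN = re.compile(r'<span[^>]*>(.*?)</span>')

def span_texts(html):
    """Return the text of every (non-nested) span in html, in order."""
    return [m.group(1) for m in ANY_SPAN.finditer(html)]

# Only bold/italics spans: group 1 is the class, group 2 is the text.
BOLD_OR_ITALICS = re.compile(r'<span class="(bold|italics)">(.+?)</span>')
pairs = [(m.group(1), m.group(2)) for m in BOLD_OR_ITALICS.finditer(line)]

# "First or second span" is then just a matter of picking the match.
first, second = span_texts(line)
print(first)
print(second)
```

In Sigil the same idea means searching for the whole span and putting the tags back in the replacement, with \1 (or \2) standing in for the text.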

Last edited by ElMiko; 01-19-2013 at 02:13 PM.
Old 01-19-2013, 07:44 PM   #4
JSWolf
An example of what I'd like to do is to take the uppercase text of fake smallcaps and convert it to lowercase. So in cases where there are multiple spans on the same line, being able to select each one is important.

<p>Just some text. <span class="SmallCaps">SOME SMALLCAP TEXT</span>. Some more text. <span class="SmallCaps">SOME MORE SMALLCAP TEXT</span>.</p>

Now I've seen stuff like this...

<p>Just some text. <span class="italic">S<span class="SmallCaps">OME</span> S<span class="SmallCaps">MALLCAP TEXT</span>.</span></p>

Can this be done?
Old 01-19-2013, 08:12 PM   #5
theducks
Well trained by Cats
Posts: 29,792
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Sure. (Having an example of the expected output for the shown input would help avoid confusion.)

Search for
Code:
<span class="SmallCaps">(.+?)</span>
and replace with
Code:
<span class="SmallCaps">\L\1</span>
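
For anyone wanting to try the same replacement outside Sigil, here is a rough Python equivalent. Python's re module has no \L in replacement strings, so the sketch uses a replacement function instead; lower_smallcaps is a made-up name for the illustration. Because sub() replaces every match, it handles however many SmallCaps spans the line has, one span per match.

```python
import re

# Groups: 1 = opening tag, 2 = span text, 3 = closing tag.
SMALLCAPS = re.compile(r'(<span class="SmallCaps">)(.+?)(</span>)')

def lower_smallcaps(html):
    """Lowercase the text of every SmallCaps span, leaving the tags alone."""
    return SMALLCAPS.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3), html)

line = ('<p>Just some text. <span class="SmallCaps">SOME SMALLCAP TEXT</span>. '
        'Some more text. <span class="SmallCaps">SOME MORE SMALLCAP TEXT</span>.</p>')

result = lower_smallcaps(line)
print(result)
```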
Old 01-20-2013, 07:01 AM   #6
Perkin
Expanding on theducks' answer, to combine the spans in the second example, you could use
Code:
<span class="SmallCaps">(.+?)</span>([\sA-Z]*)<span class="SmallCaps">(.+?)</span>
and replace with
Code:
<span class="SmallCaps">\L\1\E\2\L\3</span>
BEWARE
The trouble with these sorts of regex comes when a span contains a name or word with an uppercase letter that should stay uppercase: it gets converted to lowercase along with everything else.
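
Here is what the combining search and replace above does to JSWolf's second example, sketched in Python rather than Sigil. The replacement function stands in for \L...\E, and merge_pair is a made-up name. Only groups 1 and 3 are lowercased; the gap in group 2 (the space and the capital S) is copied through unchanged.

```python
import re

# Same search as above: two SmallCaps spans separated only by
# whitespace and capital letters.
PAIR = re.compile(
    r'<span class="SmallCaps">(.+?)</span>([\sA-Z]*)'
    r'<span class="SmallCaps">(.+?)</span>')

def merge_pair(m):
    """Equivalent of <span class="SmallCaps">\\L\\1\\E\\2\\L\\3</span>."""
    return ('<span class="SmallCaps">' + m.group(1).lower() + m.group(2)
            + m.group(3).lower() + '</span>')

line = ('<p>Just some text. <span class="italic">S<span class="SmallCaps">OME'
        '</span> S<span class="SmallCaps">MALLCAP TEXT</span>.</span></p>')

merged = PAIR.sub(merge_pair, line)
print(merged)
```

One thing to watch: because (.+?) can match tags as well as text, the first group may stretch across an unrelated span if that is the only way for the whole pattern to match. Using [^<]+ in place of .+? is a possible tightening.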

Last edited by Perkin; 01-20-2013 at 07:08 AM.
Old 01-22-2013, 08:39 AM   #7
JSWolf
Would any of these regex examples work if there was just one span, two spans, three spans, or even four spans? The problem is that with something like smallcaps, there isn't a set number of spans, so there's no way to know ahead of time how many spans exist in a given paragraph. Plus there could be other spans for other things in the same paragraph.
Old 01-23-2013, 06:35 AM   #8
Perkin
Quote:
Originally Posted by JSWolf View Post
Would any of these regex examples work if there was just one span, two spans, three spans, or even four spans?
theducks' one (post #5) will only work on one span at a time; my 'combining' one (post #6) will work on two consecutive ones. You'll just need to repeat that one if there are more than two consecutive spans.
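
A caution on repeating the combining replacement: on a second pass the already-merged span is matched again as group 1 and lowercased as a whole, so a capital that was preserved in the gap the first time would be lost. Outside Sigil, one way around this is to handle a whole run of consecutive spans in a single pass. The sketch below does that in Python; it is an assumption-laden illustration, not something from this thread. It swaps (.+?) for [^<]+ so a span's text cannot swallow other tags, and the names ONE, RUN, and normalize_smallcaps are made up.

```python
import re

# A single SmallCaps span; group 1 is its text.
ONE = re.compile(r'<span class="SmallCaps">([^<]+)</span>')

# One span, optionally followed by more spans separated only by
# whitespace and capital letters (the same gap as in post #6).
RUN = re.compile(
    r'<span class="SmallCaps">[^<]+</span>'
    r'(?:[\sA-Z]*<span class="SmallCaps">[^<]+</span>)*')

def merge_run(m):
    """Lowercase each span's text, keep the gaps, wrap the run in one span."""
    inner = ONE.sub(lambda s: s.group(1).lower(), m.group(0))
    return '<span class="SmallCaps">' + inner + '</span>'

def normalize_smallcaps(html):
    """Apply merge_run to every run of SmallCaps spans in html."""
    return RUN.sub(merge_run, html)

three = ('<p>T<span class="SmallCaps">HE</span> Q<span class="SmallCaps">UICK'
         '</span> F<span class="SmallCaps">OX</span> and '
         '<span class="italic">other</span> text.</p>')
print(normalize_smallcaps(three))
```

A run of one span is just lowercased, a run of two or more is merged, and spans with other classes in the same paragraph are left alone.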