Old 04-02-2014, 02:29 PM   #585
eschwartz
Ex-Helpdesk Junkie
Quote:
Originally Posted by PeterT
Why don't you give it a try? The challenge is how to match-up the correct <span> </span> pairs....
I'd usually search for
Code:
<span[^<>]*>([^<>]*)</span>
to get the innermost one. Once that is gone, a second pass should get rid of the outer one, shouldn't it? Unless the span contains other tags, in which case [^<>]* can't match the content and this only works some of the time.
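
To illustrate the two passes, here is a minimal Python sketch. The sample strings and variable names are made up for the example, and it assumes you simply want to drop every span and keep its content.

```python
import re

# The simple pattern: a span whose content contains no tags at all.
INNERMOST_SPAN = re.compile(r'<span[^<>]*>([^<>]*)</span>')

# Hypothetical sample: two nested spans around plain text.
sample = '<p><span class="a"><span class="b">text</span></span></p>'

# First pass removes the innermost span, second pass removes the outer one.
first_pass = INNERMOST_SPAN.sub(r'\1', sample)
second_pass = INNERMOST_SPAN.sub(r'\1', first_pass)

print(first_pass)   # <p><span class="a">text</span></p>
print(second_pass)  # <p>text</p>

# Hypothetical sample with another tag inside the span:
# [^<>]* stops at <i>, so nothing matches and the span is left alone.
mixed = '<span class="a">some <i>italic</i> text</span>'
mixed_result = INNERMOST_SPAN.sub(r'\1', mixed)
print(mixed_result == mixed)  # True
```

The last case is the "only works some of the time" problem: any tag inside the span blocks the match.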

So we'd have to use a negative lookahead, like so:
Code:
(?:(?!<(?:span|/span)>).)*
to match the content inside a span, other tags included, without letting the match run across a nested span tag.

Final regex:
Code:
<span[^<>]*>((?:(?!<(?:span|/span)>).)*)</span>
EDIT: closing the span tag with > inside the negative lookahead means it only rejects a bare <span>, so a nested opening tag with a class (or any other attribute) slips through. Instead, we will just look for and reject
Code:
<(span|/span)
which works as well, I think.
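
A quick way to see the difference is to run both lookahead variants on a nested span that carries a class. This is only a sketch in Python; the names STRICT and LOOSE and the sample string are invented for the comparison.

```python
import re

# Lookahead with the closing ">": only rejects a bare <span> or </span>.
STRICT = re.compile(r'<span[^<>]*>((?:(?!<(?:span|/span)>).)*)</span>')
# Lookahead without ">": rejects any tag starting with <span or </span.
LOOSE = re.compile(r'<span[^<>]*>((?:(?!<(?:span|/span)).)*)</span>')

# Hypothetical sample: a span with a class nested in another one.
sample = '<span class="a">x<span class="b">y</span>z</span>'

strict_result = STRICT.sub(r'\1', sample)
loose_result = LOOSE.sub(r'\1', sample)

# STRICT pairs the outer opening tag with the inner closing tag,
# so "z" ends up inside the wrong span.
print(strict_result)  # x<span class="b">yz</span>

# LOOSE refuses to cross the nested tag and removes the inner span only.
print(loose_result)   # <span class="a">xyz</span>
```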

Final regex:
Code:
<span[^<>]*>((?:(?!<(?:span|/span)).)*)</span>
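
Since each pass only strips the innermost spans, the final regex has to be applied repeatedly until nothing changes. A rough Python sketch of that loop follows. The helper name strip_spans is hypothetical, and the re.DOTALL flag is my own assumption so that "." also matches line breaks inside a span; drop it if your editor handles that differently.

```python
import re

# The final regex from above; DOTALL is an assumption, not part of the original.
SPAN = re.compile(r'<span[^<>]*>((?:(?!<(?:span|/span)).)*)</span>', re.DOTALL)

def strip_spans(html):
    """Remove span tags from the inside out, keeping their content."""
    while True:
        # subn returns the new string and the number of replacements made.
        html, count = SPAN.subn(r'\1', html)
        if count == 0:
            return html

# Hypothetical sample: nested spans, another tag, and a line break.
sample = '<p><span class="a">one <span class="b">two <i>three</i></span>\nfour</span></p>'
print(strip_spans(sample))
```

To strip only one kind of span, the opening part of the pattern could be narrowed to a specific class, but then spans of other classes nested inside it would block the match, so test that on a copy first.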

Last edited by eschwartz; 04-02-2014 at 02:40 PM.