Old 12-23-2014, 04:40 PM   #6
eschwartz
Ex-Helpdesk Junkie
The trick is to avoid the dot-match-all symbol. Use a regex character class instead, like
Code:
[a-zA-Z0-9.?',"]
Or match all but tag brackets:
Code:
[^<>]
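To see the difference, here is a rough Python sketch (the sample string and variable names are made up for illustration) comparing a greedy dot with the two character classes above:

```python
import re

# Hypothetical sample text, not taken from any real book.
html = '<p>One <i>two</i> three <i>four</i></p>'

# Greedy dot: runs on to the LAST closing tag and swallows everything between.
greedy = re.findall(r'<i>(.*)</i>', html)

# Explicit character class: only the listed characters may appear.
explicit = re.findall(r'''<i>([a-zA-Z0-9.?',"]*)</i>''', html)

# Negated class: anything except tag brackets, so it cannot cross a tag.
bounded = re.findall(r'<i>([^<>]*)</i>', html)

print(greedy)    # ['two</i> three <i>four']
print(explicit)  # ['two', 'four']
print(bounded)   # ['two', 'four']
```

Note that the explicit class as written has no space in it, so it would stop at the first space in longer text; the negated class does not have that problem.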
I found this regex tutorial to be very helpful in learning the various fine points of regex: http://www.regular-expressions.info/

There are some very interesting yet obscure applications in the corners.

Like this use of a negative lookahead to find a pair of span tags with no other span tag between them. With nested spans it matches the innermost pair first, so repeating the replacement deletes every matching set:

Code:
<span[^<>]*>((?:(?!<(?:/?span)).)*)</span>
The bit on the inside matches only text that does not contain "<span" or "</span" (the pattern "</?span"), so the opening and closing tags it finds belong to each other.
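A minimal Python sketch of how that could be applied, assuming "delete" means dropping the span tags while keeping the text inside them. The function name, the sample string, and the use of re.DOTALL (so a span may run across line breaks) are my own choices for illustration:

```python
import re

# Same pattern as above; DOTALL is an added assumption so "." also matches newlines.
SPAN = re.compile(r'<span[^<>]*>((?:(?!<(?:/?span)).)*)</span>', re.DOTALL)

def strip_spans(html):
    """Unwrap span tags, innermost first, until none are left."""
    while True:
        # Replace each matched span with just its captured contents.
        html, count = SPAN.subn(r'\1', html)
        if count == 0:
            return html

# Hypothetical sample with one span nested inside another.
sample = '<p><span class="a">one <span class="b">two</span> three</span></p>'
result = strip_spans(sample)
print(result)  # <p>one two three</p>
```

A single pass only unwraps spans that contain no other span, which is why the loop keeps going until nothing matches. In an editor's find and replace you would get the same effect by running Replace All a few times.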

Last edited by eschwartz; 12-23-2014 at 04:51 PM.