Thread: Regex examples
02-21-2012, 08:08 AM   #18
Timur
@DiapDealer: Does this narrow down your set enough? It should match any italic span that contains at least one non-word character (Unicode-aware), which includes contractions. It excludes empty spans, which should be easy enough to remove beforehand or afterwards.

Code:
(*UCP)(?U)<span class="italic">[^<]*\W[^<]*</span>
If you do not want to miss anything at all (such as nested spans), use .* instead of [^<]*. You will probably also get some unwanted matches that run across several spans, though.
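For anyone who wants to try this outside a PCRE-based editor, here is a rough Python sketch of the same idea. It is an approximation, not the exact expression above: as far as I know Python's re module accepts neither (*UCP) nor (?U), so the sketch relies on str patterns being Unicode-aware by default and writes the lazy quantifiers out by hand. The names STRICT and LOOSE and the sample markup are made up for illustration.

```python
import re

# Assumption: Python stand-in for the PCRE pattern. Unicode-aware \W is the
# default for str patterns; *? replaces the ungreedy (?U) switch.
STRICT = re.compile(r'<span class="italic">[^<]*?\W[^<]*?</span>')

# Variant with . in place of [^<], as suggested for nested spans.
LOOSE = re.compile(r'<span class="italic">.*?\W.*?</span>')

# Hypothetical sample markup, not taken from a real book.
sample = (
    '<p><span class="italic">word</span> '
    '<span class="italic">don\'t</span> '
    '<span class="italic">two words</span> '
    '<span class="italic"></span> '
    '<span class="italic">café</span></p>'
)

# Only the spans containing an apostrophe or a space are reported;
# "word", the empty span and "café" (é is a word character) are skipped.
strict_hits = STRICT.findall(sample)
for hit in strict_hits:
    print(hit)

# The looser variant can run across span boundaries: here the italic span
# holds only "word", yet the match extends to the end of the bold span.
spill = '<span class="italic">word</span> and <span class="bold">it\'s</span>'
strict_spill = STRICT.search(spill)
loose_spill = LOOSE.search(spill)
print(strict_spill)          # None
print(loose_spill.group(0))  # the whole of `spill`
```

The second half shows the kind of unwanted multi-span match mentioned above, so the [^<] version is probably the safer starting point unless nested spans really matter.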