Quote:
Originally Posted by jackie_w
I think you need Regex's advanced features 'lookahead' and 'lookbehind' to do what you want. I'm by no means expert in this but I think it would be something like
Code:
(?<=<i>)(.*)(?=</i>)
Google should find you more detailed info.
Wrong use of lookaround: that restricts you to matching just the text, without the tags around it.
Useful feature, but not quite what the OP wanted.
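To make that concrete, here is a rough Python sketch of the difference. The sample string is made up for illustration, and I'm assuming Python's re module here; other regex flavours should behave the same way for these patterns.

```python
import re

# Hypothetical sample text, not from the original thread.
sample = "some <i>italic</i> words"

# Lookbehind/lookahead only assert that the tags are there; they do not
# consume them, so the overall match is just the text in between.
lookaround = re.search(r"(?<=<i>)(.*)(?=</i>)", sample)
print(lookaround.group(0))  # italic

# Writing the tags into the pattern itself makes them part of the match,
# while group 1 still holds only the inner text.
with_tags = re.search(r"<i>(.*)</i>", sample)
print(with_tags.group(0))  # <i>italic</i>
print(with_tags.group(1))  # italic
```

So if the goal is to replace or strip the tags themselves, the tags have to be inside the match, not inside a lookaround.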
...
davidfor's solution works, kind of... but the proper way to do it looks like this:
Code:
<i>((?:(?!</?i>).)*)</i>
The middle part, (?:(?!</?i>).)*, uses the power of negative lookaheads to ensure that the dot-match-everything can only consume a character when no "i" opening/closing tag starts at that position.
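If it helps, the same pattern can be spelled out piece by piece. This is only a sketch using Python's re.VERBOSE flag (my choice for the illustration, so that comments can sit inside the pattern); the variable names and test strings are made up.

```python
import re

# The pattern from above, split up with comments (re.VERBOSE ignores
# whitespace and "#" comments inside the pattern).
tempered = re.compile(r"""
    <i>              # literal opening tag
    (                # group 1: the text between the tags
      (?:            #   repeat, one character at a time:
        (?!</?i>)    #     give up if an <i> or </i> tag starts here
        .            #     otherwise consume one character
      )*
    )
    </i>             # literal closing tag
""", re.VERBOSE)

# The repeated part on its own stops as soon as a tag begins.
inner_only = re.match(r"(?:(?!</?i>).)*", "abc</i>def")
print(inner_only.group(0))  # abc

print(tempered.search("a <i>b</i> c").group(1))  # b
```

Note that the dot does not match newlines by default in Python, so a tag pair spanning several lines would need the re.DOTALL flag as well.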
...
The difference between mine and davidfor's solutions is that, given the sample text:
Code:
<i>sample <i>text</i></i>
Mine will match the single internal unit "<i>text</i>", whereas davidfor's will erroneously match from the first opening "i" tag until the first closing "i" tag, i.e. "<i>sample <i>text</i>".
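A quick way to check this yourself, sketched in Python. davidfor's exact regex isn't quoted in this post, so the lazy pattern below is only my assumed stand-in for it, chosen because it behaves the way described above.

```python
import re

sample = "<i>sample <i>text</i></i>"

# The negative-lookahead pattern from this post.
tempered = re.compile(r"<i>((?:(?!</?i>).)*)</i>")

# ASSUMPTION: a plain lazy match standing in for davidfor's solution,
# which is not reproduced here.
lazy = re.compile(r"<i>(.*?)</i>")

tempered_match = tempered.search(sample).group(0)
lazy_match = lazy.search(sample).group(0)

print(tempered_match)  # <i>text</i>
print(lazy_match)      # <i>sample <i>text</i>
```

The lazy version starts at the first opening tag and happily runs over the second one, while the lookahead version refuses to cross any "i" tag and so lands on the innermost pair.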
This would matter a lot more if we were talking about a tag which is meant to be nested, like spans.
Which is why it is a very useful principle to know, although in this case you really can just make do with the easy solution.
Credits: I originally learned this trick here:
https://stackoverflow.com/questions/.../406408#406408