I do a first pass like so:
Code:
text: <span class="italic">something</span>
replace: <span class="italic">([^<]+)</span>
with: <i>\1</i>
result: <i>something</i>
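If you'd rather script this pass than run it in an editor's search-and-replace box, here is a minimal sketch in Python using `re.sub`. The function name is made up for illustration, and it assumes the span holds plain text only, as in the example above.

```python
import re

def italic_span_to_i(html):
    # [^<]+ only matches text with no tags in it, so a span that
    # wraps another tag is left untouched by this pass.
    return re.sub(r'<span class="italic">([^<]+)</span>', r'<i>\1</i>', html)

print(italic_span_to_i('<span class="italic">something</span>'))
# prints: <i>something</i>
```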
It doesn't work when there are other tags inside the italic span...
Code:
<span class="italic"><span class="bold">something</span></span>
... but a non-greedy replace usually cleans those up, as long as you convert the innermost span first (a non-greedy match stops at the first </span> it finds):
Code:
text: <span class="italic"><span class="bold">something</span></span>
replace: <span class="bold">(.*?)</span>
with: <b>\1</b>
result: <span class="italic"><b>something</b></span>
replace: <span class="italic">(.*?)</span>
with: <i>\1</i>
result: <i><b>something</b></i>
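One thing to watch when scripting the nested case: a non-greedy `(.*?)` ends at the first `</span>` it meets, so the order of the passes matters. A minimal Python sketch that converts the innermost span first (the function name is hypothetical, and it assumes only one level of nesting, as in the example):

```python
import re

def convert_nested(html):
    # Inner span first: [^<]+ guarantees it contains no further tags.
    html = re.sub(r'<span class="bold">([^<]+)</span>', r'<b>\1</b>', html)
    # With the inner span gone, the first </span> after the italic
    # opener is the one that closes it, so non-greedy is safe here.
    html = re.sub(r'<span class="italic">(.*?)</span>', r'<i>\1</i>', html)
    return html

print(convert_nested('<span class="italic"><span class="bold">something</span></span>'))
# prints: <i><b>something</b></i>
```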
A similar pattern is useful to isolate stuff in quotes:
Code:
text: <a id="return_from_note_1"></a><a href="notes.html#note_1">see note 1</a>
replace: <a id="([^"]+)"></a><a href="([^"]+)">(.*?)</a>
with: <a id="\1" href="\2">\3</a>
result: <a id="return_from_note_1" href="notes.html#note_1">see note 1</a>
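The same merge as a Python sketch, in case it helps to see the three capture groups side by side. The function name is an assumption for illustration; the pattern is the one shown above.

```python
import re

def merge_anchor_pair(html):
    # \1 = id value, \2 = href value, \3 = link text.
    # [^"]+ stops at the closing quote, so each attribute value
    # is captured without spilling into the next attribute.
    pattern = r'<a id="([^"]+)"></a><a href="([^"]+)">(.*?)</a>'
    return re.sub(pattern, r'<a id="\1" href="\2">\3</a>', html)

text = '<a id="return_from_note_1"></a><a href="notes.html#note_1">see note 1</a>'
print(merge_anchor_pair(text))
# prints: <a id="return_from_note_1" href="notes.html#note_1">see note 1</a>
```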
... or for removing all attributes from uselessly-extravagant tags:
Code:
text: <i class="italic" style="margin:auto;padding:auto;font-size=1em;">something</i>
replace: <i [^>]+>(.*?)</i>
with: <i>\1</i>
result: <i>something</i>
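Two details matter when scripting the attribute cleanup: the replacement refers to `\1`, so the pattern needs a capture group around the tag's contents, and the pattern should require whitespace after `<i` so it does not also match tags like `<img ...>`. A minimal Python sketch under those assumptions (the function name is hypothetical):

```python
import re

def strip_i_attributes(html):
    # <i\s[^>]*> matches an <i> tag that carries attributes;
    # the \s keeps it from matching <img>, <iframe>, <input>, etc.
    # (.*?) captures the contents so \1 can put them back.
    return re.sub(r'<i\s[^>]*>(.*?)</i>', r'<i>\1</i>', html)

text = '<i class="italic" style="margin:auto;padding:auto;font-size=1em;">something</i>'
print(strip_i_attributes(text))
# prints: <i>something</i>
```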