06-20-2013, 05:53 AM   #10
Funslinger
Quote:
Originally Posted by cybmole
yes, the solutions given only work if open quote looks different to / can be distinguished from closing quote

and quotes within quotes is a whole new ball game!
This situation can be handled fairly easily using recursion to match opening and closing quotes.

I don't know whether the regular expression engine in the text editor TextMate is the same as the one in Sigil, so this may need adjusting there. In TextMate, though, the following regular expression finds a string consisting of an entire HTML element.

<\?xml[^>]+>|<!DOCTYPE(?:[^\]]*]>|[^>]*>)|<[^/ >]+[^>]*/>|<(?<tagname>[^/ >]+)[^>]*>(?<!/>)(?<html>[^<]|<[^/ >]+[^>]*/>|<(?<tagname>[^/ >]+)[^>]*>(?<!/>)\g<html>*</\k<tagname+0>>)*</\k<tagname+0>>


Example: take the following string of text.

<p>This is an <i>example</i> paragraph.</p><p>This is a second paragraph.</p>

If the cursor is at the beginning of the text, the regular expression will match <p>This is an <i>example</i> paragraph.</p>. If the cursor is after the first < but not after the second <, it will match <i>example</i>. If the cursor is after the second <, it will match <p>This is a second paragraph.</p>.

In other words, it matches the first opening HTML tag it encounters together with its corresponding closing tag. But it only works on well-formed HTML. For example, in this improperly formatted HTML string

<p>This is the first paragraph<p>This is the second paragraph</p>

it will not match the first paragraph because the first closing tag </p> is missing.

The regular expression can handle tags that close themselves like <p/> or <div/> or <link href="my.css" type="text/css" rel="stylesheet"/> or <a name="chap4" id="chap4"/>.
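
For anyone who wants to experiment outside TextMate, here is a rough Python sketch of the same idea. It is not the regular expression above: Python's built-in re module has no recursion, so this swaps in a small stack of open tag names to do the pairing. The names TAG, match_element and find_element are made up for illustration, and it assumes tag-like text only (no comments, no CDATA, no > inside attribute values). It also simply skips <?xml ...> and <!DOCTYPE ...> instead of matching them.

```python
import re

# Matches one tag: group 1 is "/" for a closing tag, group 2 is the tag name,
# group 3 is "/" for a self-closing tag such as <p/>.
# Names must start with a letter, so <?xml ...> and <!DOCTYPE ...> are skipped.
TAG = re.compile(r"<(/?)([A-Za-z][^/ >]*)[^>]*?(/?)>")


def match_element(text, opening):
    """Return the end index of the element started by `opening`, or None.

    `opening` is a TAG match object. The stack plays the role that
    recursion plays in the regular expression.
    """
    closing, name, self_closing = opening.groups()
    if closing:
        return None              # a closing tag cannot start an element
    if self_closing:
        return opening.end()     # <p/> is a complete element on its own
    stack = [name]
    pos = opening.end()
    while stack:
        tag = TAG.search(text, pos)
        if tag is None:
            return None          # ran out of tags before everything closed
        closing, name, self_closing = tag.groups()
        if self_closing:
            pass                 # self-closing tags do not change nesting
        elif closing:
            if stack[-1] != name:
                return None      # closing tag does not pair with the open one
            stack.pop()
        else:
            stack.append(name)
        pos = tag.end()
    return pos


def find_element(text, cursor=0):
    """Return the first complete element found at or after `cursor`, or None."""
    pos = cursor
    while True:
        opening = TAG.search(text, pos)
        if opening is None:
            return None
        end = match_element(text, opening)
        if end is not None:
            return text[opening.start():end]
        pos = opening.start() + 1  # give up on this tag, try the next one


if __name__ == "__main__":
    sample = ("<p>This is an <i>example</i> paragraph.</p>"
              "<p>This is a second paragraph.</p>")
    print(find_element(sample, 0))
    print(find_element(sample, 1))
    print(find_element(sample, sample.index("<i>") + 1))
```

Run against the example text, the three calls correspond to the three cursor positions described earlier. With the improperly formatted string, this sketch cannot pair the first <p>, so it moves on and returns the second paragraph.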

Last edited by Funslinger; 06-20-2013 at 05:58 AM.