Thread: Reg-ex help...?
Old 12-05-2011, 04:13 PM   #5
Serpentine
Evangelist
Quote:
Originally Posted by ElMiko View Post
Did you mean there's an easy way to delete all div tags without writing reg-ex? As always, if I could impose on you to explain part of your code, too, I'd be most grateful. Specifically: "\b[^<>]*". Thanks
Nope, there's no direct XML manipulation like that in Sigil - I just meant that deleting the tags themselves is easier than replacing each div with its content.
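To make "deleting the tags themselves" concrete, here is a rough sketch in Python rather than Sigil's Find & Replace. The pattern `</?div\b[^<>]*>` and the sample markup are my assumptions for illustration; the expression used earlier in the thread isn't quoted in this post.

```python
import re

# Hypothetical input, made up for this sketch.
html = '<div class="wrap"><p>Hello</p></div><divider/>'

# Assumed pattern: optional slash, the name "div", a word boundary,
# then any attributes up to the closing ">".
div_tag = re.compile(r'</?div\b[^<>]*>')

# Replacing every match with an empty string deletes the div tags
# and leaves whatever was between them in place.
stripped = div_tag.sub('', html)
print(stripped)
```

In Sigil the equivalent would be putting the pattern in the Find box and leaving Replace empty. Note that the made-up `<divider/>` tag is left alone, because `\b` cannot match between the "v" and the "i".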

\b[^<>]* is just a 'nice' way of dealing with tags and attributes.
\b matches at a word boundary, i.e. at either end of a word. A word is generally just anything that matches \w+.
The \b stops a match when only part of the tag name matches. It's a habit from searching for something like a <p> tag, where you need to be careful not to match <pre> tags as well.
Code:
<p([^<>]*)> // will match both <p yup="1"> and <pre something="wat">
<p\b[^<>]*> // will match p's but not pre's.
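If you want to check this outside Sigil, the same two expressions can be run through Python's `re` module, which treats `\b` the same way for this case. The sample string is made up for the sketch.

```python
import re

# Hypothetical sample markup, not taken from a real book.
sample = '<p yup="1">text</p><pre something="wat">code</pre>'

loose = re.compile(r'<p([^<>]*)>')   # no word boundary after the "p"
strict = re.compile(r'<p\b[^<>]*>')  # "p" must be the whole tag name

loose_matches = [m.group(0) for m in loose.finditer(sample)]
strict_matches = [m.group(0) for m in strict.finditer(sample)]

print(loose_matches)   # both the <p ...> and the <pre ...> opening tags
print(strict_matches)  # only the <p ...> opening tag
```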
Using [^<>]*> rather than the more common [^>]*> is a measure to avoid destroying badly formatted tags. It's not a huge problem, but if a closing > has been removed by mistake, [^<>]* stops at the next < instead of running on through the content and the following tag(s).
Code:
Using the sample : <p Some text here</p>
</?p\b[^<>]*> : matches only </p>
</?p\b[^>]*> : matches all of <p Some text here</p>
Not a very good example, but with nested tags you can run into some pretty nasty stuff. You can always avoid it by validating first, though.
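Here is a sketch of the nastier nested case, again in Python as a stand-in for Sigil's search. The `nested` string is a made-up example of a tag that lost its closing >, and deleting matches with `sub` is just one way to show the damage.

```python
import re

strict = re.compile(r'</?p\b[^<>]*>')  # stops at the next "<"
loose = re.compile(r'</?p\b[^>]*>')    # runs on until the next ">"

# The sample from above: the opening tag is missing its ">".
sample = '<p Some text here</p>'
strict_hits = [m.group(0) for m in strict.finditer(sample)]
loose_hits = [m.group(0) for m in loose.finditer(sample)]

# Hypothetical nested case: the broken <p is followed by a <b> tag.
nested = '<p class="x" Some <b>bold</b> text</p>'
strict_result = strict.sub('', nested)
loose_result = loose.sub('', nested)

print(strict_hits, loose_hits)
print(strict_result)
print(loose_result)
```

With the strict pattern only the closing </p> is removed and the broken opening tag is left for you to fix by hand. The loose pattern swallows the text up to and including the <b> tag, so you lose content and end up with an orphaned </b>.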

Last edited by Serpentine; 12-05-2011 at 04:16 PM. Reason: better example