Quote:
Originally Posted by ElMiko
did you mean there's an easy way to delete all div tags without writing reg-ex? As always, if i could impose on you to explain part of your code, too, I'd be most grateful. Specifically: "\b[^<>]*". Thanks
Nope, there's no direct XML manipulation like that in Sigil - I just meant that rather than replacing the div tags with their content, just deleting the tags themselves is easier.
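As a rough illustration of "just deleting the tags", here's a minimal sketch in Python rather than in Sigil's Find & Replace. The sample markup is made up, and I'm assuming Python's re module treats this pattern the same way Sigil's regex engine does:

```python
import re

# Hypothetical sample markup, not taken from the thread.
html = '<div class="wrap"><p>Hello</p></div>'

# Replace every opening or closing div tag with nothing.
# The content between the tags is never matched, so it stays put.
stripped = re.sub(r'</?div\b[^<>]*>', '', html)

print(stripped)
```

In Sigil itself the equivalent would be putting the same pattern in the Find box and leaving Replace empty.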
\b[^<>]* is just a 'nice' way of dealing with tags and attributes.
\b matches at either end of a word, where a word is generally anything that matches \w+.
The \b stops the pattern matching on only part of a tag name. It's a habit from searching for something like a <p> tag, where you need to be careful to avoid <pre> tags.
Code:
<p([^<>]*)> // will match both <p yup="1"> and <pre something="wat">
<p\b[^<>]*> // will match p's but not pre's.
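If you want to check the two patterns outside Sigil, here's a small sketch in Python. The sample tags are the ones from the comments above plus a bare <p>, and the variable names are just for illustration. I'm assuming Python's re module agrees with Sigil's engine on \b here:

```python
import re

samples = ['<p yup="1">', '<pre something="wat">', '<p>']

# Without \b, "re something=..." is swallowed by [^<>]* so <pre> matches too.
loose = re.compile(r'<p([^<>]*)>')
# With \b, the character after "p" must not be a word character.
strict = re.compile(r'<p\b[^<>]*>')

loose_hits = [s for s in samples if loose.fullmatch(s)]
strict_hits = [s for s in samples if strict.fullmatch(s)]

print(loose_hits)
print(strict_hits)
```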
Using [^<>]*> rather than the more common [^>]*> is a measure to avoid destroying badly formatted tags. It's not a huge problem, but if a closing > has been removed by mistake, this stops the pattern from matching the content and the following tag(s).
Code:
Using the sample : <p Some text here</p>
</?p\b[^<>]*> : matches only </p>
</?p\b[^>]*> : matches the whole of <p Some text here</p>
Not a very good example, but with nested tags you can run into some pretty nasty stuff. You can always avoid it by validating first, though.
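Here's the same broken sample run through both patterns in Python, as a quick sketch. The variable names are mine, and again I'm assuming Python's re behaves like Sigil's engine for these patterns:

```python
import re

# The opening tag is missing its closing ">".
sample = '<p Some text here</p>'

# [^<>]* cannot cross the "<" of </p>, so the broken opening tag never matches.
safe = re.findall(r'</?p\b[^<>]*>', sample)
# [^>]* happily runs through "</p" and takes the text along with it.
greedy = re.findall(r'</?p\b[^>]*>', sample)

print(safe)
print(greedy)
```

With the second pattern, a replace-with-nothing would wipe out the text as well as the tags, which is the damage the [^<>] version avoids.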