Quote:
Originally Posted by kakkalla
Why doesn't the ^ tell the find to begin at the beginning of the line?
By default, PCRE treats the search subject as one string. For all intents and purposes, the contents of your HTML file are one big string, so the ^ and $ metacharacters will only match the very beginning and the very end of that string, respectively. Putting PCRE in multi-line mode by prefacing your expression with (?m) makes ^ and $ match the beginning and end of each line, in the manner you're expecting.
Code:
(?m)^<p class="(list.+)"><span class="(list.*)">(.+)</span>(.+)</p>
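If you want to see the difference for yourself, here is a rough sketch using Python's re module as a stand-in for PCRE (an assumption on my part; the two engines treat ^ and (?m) the same way for this case). The sample html string is made up for illustration.

```python
import re

# Made-up sample: the <p> tag starts the second line, not the whole string.
html = (
    '<body>\n'
    '<p class="list1"><span class="listnum">1.</span> First item</p>\n'
    '</body>'
)

pattern = r'^<p class="(list.+)"><span class="(list.*)">(.+)</span>(.+)</p>'

# Default mode: ^ only matches at the very start of the whole string,
# which here is "<body>", so nothing is found.
no_flag = re.search(pattern, html)

# Multi-line mode: ^ also matches right after each newline.
with_flag = re.search('(?m)' + pattern, html)

print(no_flag)             # None
print(with_flag.groups())  # ('list1', 'listnum', '1.', ' First item')
```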
You could also forget multi-line mode (and ^ or $ entirely) and use a negative lookbehind to achieve a similar effect:
Code:
(?<!\>)<p class="(list.+)"><span class="(list.*)">(.+)</span>(.+)</p>
In other words, don't match the pattern if it's immediately preceded by the closing '>' of another tag.
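A quick way to check the lookbehind's behaviour, again sketched with Python's re module rather than PCRE itself (an assumption; both support this kind of fixed-width negative lookbehind). The two sample strings are invented.

```python
import re

pattern = re.compile(
    r'(?<!\>)<p class="(list.+)"><span class="(list.*)">(.+)</span>(.+)</p>'
)

# Invented samples: one <p> on its own, one directly after another tag.
standalone = '<p class="list1"><span class="listnum">1.</span> First item</p>'
after_tag = '<div>' + standalone + '</div>'

# Nothing precedes the <p>, so the negative lookbehind is satisfied.
standalone_hit = pattern.search(standalone)

# The <p> sits right after the '>' of <div>, so the match is rejected.
after_tag_hit = pattern.search(after_tag)

print(standalone_hit is not None)  # True
print(after_tag_hit)               # None
```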
I highly recommend using something similar to the second method. The common practice of indenting HTML code for readability puts whitespace in front of the tag, which could easily mess with your idea of what constitutes "the beginning" of a line: a pattern anchored with ^ would no longer match an indented tag.
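To illustrate the indentation problem, here is a small comparison of the two expressions on an indented line. As before, this is only a sketch with Python's re module standing in for PCRE, and the sample markup is made up.

```python
import re

body = r'<p class="(list.+)"><span class="(list.*)">(.+)</span>(.+)</p>'

anchored = re.compile(r'(?m)^' + body)      # first method
lookbehind = re.compile(r'(?<!\>)' + body)  # second method

# Made-up sample: the <p> line is indented by four spaces.
indented = (
    '<div>\n'
    '    <p class="list1"><span class="listnum">1.</span> First item</p>\n'
    '</div>'
)

# ^ lands before the leading spaces, not before "<p", so this finds nothing.
anchored_hit = anchored.search(indented)

# The character before "<p" is a space, not '>', so this still matches.
lookbehind_hit = lookbehind.search(indented)

print(anchored_hit)               # None
print(lookbehind_hit.group(1))    # list1
```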