Quote:
Originally Posted by pdurrant
So you want to find things like
<p class="calibre8"> <span class="calibre3"> — « […]
I think the OP wants the opposite: he wants to target the paragraph when the sentence
doesn't begin with <space><em-dash>. (It was a bit confusing; I hope I understood correctly.)
Since the expression that must prevent the replacement is a group of 2 chars (i.e. <space>—), I'm afraid we need a negative lookbehind once more.
The thing is, a space at the beginning of a paragraph is useless: it is not displayed and is not recommended. Without it, this would have been easier.
Anyway, given the way Reinsley displayed his sentences, I would propose this:
Code:
(<p class="calibre8"> <span class="calibre3">)(?<! — )(«[^»]+»,.*?\.) (« )
Replace:
\1\2</span></p>\n\n\1\3 (if you want a new paragraph; the </span> closes the span opened in \1)
\1\2<br/>\3 (if you just want a line break)
Note: to get rid of the space at the beginning of the 2nd line, I left it outside of all groups.
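For anyone who wants to try the <br/> replacement outside the editor, here is a minimal sketch in Python. The pattern and replacement are copied from above; the two sample paragraphs and the function name break_second_quote are made up for illustration, and I am assuming plain spaces (not non-breaking ones) around the guillemets.

```python
import re

# Pattern and replacement copied from the post; the sample text is made up.
PATTERN = re.compile(
    r'(<p class="calibre8"> <span class="calibre3">)(?<! — )'
    r'(«[^»]+»,.*?\.) (« )'
)

def break_second_quote(html):
    # The space before group 3 sits outside every group, so it is dropped.
    return PATTERN.sub(r'\1\2<br/>\3', html)

plain = '<p class="calibre8"> <span class="calibre3">« Oui », dit-il. « Non »</span></p>'
dashed = '<p class="calibre8"> <span class="calibre3"> — « Oui », dit-il. « Non »</span></p>'

print(break_second_quote(plain))   # a <br/> replaces the space before the second «
print(break_second_quote(dashed))  # printed unchanged
```

In this sample, at least, the dash-led paragraph would be skipped even without the lookbehind, since the pattern requires « directly after the span tag.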
If you just want a <br/>, not a new <p>, there is a lighter variant (lighter in terms of RAM) with only one capturing group:
Code:
(?:<p class="calibre8"> <span class="calibre3">)(?<! — )(?:«[^»]+»,.*?\.)\K (« )
Replace:
<br/>\1
Explanation:
(?:<p class="calibre8"> <span class="calibre3">) --> (?:expr) is a non-capturing group
(?<! — ) --> Negative lookbehind: the match fails if the text just before this point is <space>—<space>
(?:«[^»]+»,.*?\.) --> Non-capturing group (the rest of the sentence)
\K --> Forget everything matched before \K; the part to replace starts here
<space>(« ) --> <space> and group 1 (the space is outside the capturing group, so it is not kept)
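To make the effect of \K concrete, here is a rough sketch in Python. The standard re module has no \K, so this emulates it with a replacement function that keeps everything before the final space and rewrites only the tail. The function name add_line_break and the sample text are hypothetical; an engine that supports \K does this for you.

```python
import re

# Same pattern as the \K variant, minus \K, which the stdlib re module lacks.
PATTERN = re.compile(
    r'(?:<p class="calibre8"> <span class="calibre3">)(?<! — )'
    r'(?:«[^»]+»,.*?\.) (« )'
)

def add_line_break(html):
    def repl(m):
        # Keep the match up to (not including) the space before group 1;
        # only that space and group 1 are what \K would leave to replace.
        kept = m.string[m.start():m.start(1) - 1]
        return kept + "<br/>" + m.group(1)
    return PATTERN.sub(repl, html)

sample = '<p class="calibre8"> <span class="calibre3">« Oui », dit-il. « Non »</span></p>'
print(add_line_break(sample))
```

The output should be the same as with the three-group version and its <br/> replacement.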