Quote:
Originally Posted by jordy1955
I did say that I'm pretty basic in my understanding of regex... I just realised what the \1 \2 does. Tested it and it works beautifully.
thankyou so much. You have saved me hours of manual intervention and frustration
The parentheses are "capture groups". In the replacement, "\1" stands for whatever the first group matched, "\2" for the second, and so on.
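To see the numbering in action outside the editor, here is a small Python sketch. The sample text and the pattern are made up for illustration and are not from the book being fixed.

```python
import re

# Hypothetical sample: swap "Surname, Firstname" using numbered groups.
text = "Smith, John"

# The first (\w+) is group 1, the second (\w+) is group 2.
# In the replacement, \2 and \1 put them back in the opposite order.
swapped = re.sub(r"(\w+),\s*(\w+)", r"\2 \1", text)
print(swapped)  # John Smith
```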
Another one I have used recently was:
Code:
([[:lower:]])\s*</p>\s*<p>\s*([[:lower:]])
That matches when a paragraph ends with a lower case letter and the next one starts with a lower case letter, with optional whitespace around the tags. When that matches, I am sure it is a single paragraph that has been split in two. With the first one, I generally look at each match to check what was actually intended.
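As a rough sketch of how that search could be used to rejoin the paragraphs, here it is in Python. Two assumptions: Python's re module does not support [[:lower:]], so ASCII [a-z] stands in for it here, and the replacement "\1 \2" (the two captured letters with a single space between them) is my guess at a sensible replacement. The function name and sample text are hypothetical.

```python
import re

# Assumption: [a-z] replaces [[:lower:]], which Python's re does not support.
SPLIT_PARA = re.compile(r"([a-z])\s*</p>\s*<p>\s*([a-z])")

def join_split_paragraphs(html: str) -> str:
    """Merge a paragraph break that sits between two lower case letters."""
    # \1 and \2 restore the captured letters, separated by one space.
    return SPLIT_PARA.sub(r"\1 \2", html)

sample = (
    "<p>The quick brown fox</p>\n"
    "<p>jumped over the dog.</p>\n"
    "<p>Next paragraph.</p>"
)
print(join_split_paragraphs(sample))
```

Only the first break is joined. The second one follows a full stop and precedes a capital letter, so it is left alone.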
Note that this one doesn't allow for a class attribute on the <p> tag. When I am doing this much fixing, I first remove the class from the normal paragraphs. If any paragraphs with a class are left after that, they probably carry other formatting that I don't want to lose.
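Removing the class from the normal paragraphs might look something like this in Python. The class name "calibre1" is purely a placeholder, and the sketch assumes the tag carries nothing but that one class attribute. Paragraphs with any other class are left untouched.

```python
import re

# Hypothetical class name; use whatever the book has on ordinary paragraphs.
NORMAL_CLASS = "calibre1"

def strip_normal_class(html: str, class_name: str = NORMAL_CLASS) -> str:
    """Turn <p class="class_name"> into a bare <p>; leave other classes alone."""
    # re.escape guards against regex metacharacters in the class name.
    pattern = re.compile(r'<p\s+class="' + re.escape(class_name) + r'"\s*>')
    return pattern.sub("<p>", html)

sample = '<p class="calibre1">Plain text</p><p class="centered">* * *</p>'
print(strip_normal_class(sample))
```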