Old 06-17-2022, 11:03 PM   #9
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by jordy1955
I did say that I'm pretty basic in my understanding of regex... I just realised what the \1 \2 does. Tested it and it works beautifully.

Thank you so much. You have saved me hours of manual intervention and frustration.
The parentheses are "capture groups". In the replacement, "\1" stands for whatever the first group matched, "\2" for the second, and so on.
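As a rough illustration of how the groups and "\1", "\2" fit together, here is a small sketch in Python. The name-swapping pattern and the function name are made up for the example and are not the expression from earlier in the thread.

```python
import re

# Hypothetical pattern: two capture groups separated by a comma and a space.
# Group 1 is the surname, group 2 is the forename.
NAME_PATTERN = re.compile(r"(\w+), (\w+)")

def swap_names(text: str) -> str:
    # In the replacement, \2 is the text matched by the second group
    # and \1 is the text matched by the first group.
    return NAME_PATTERN.sub(r"\2 \1", text)

print(swap_names("Austen, Jane"))  # prints "Jane Austen"
```

The same idea applies in any search and replace that supports regular expressions: whatever the parentheses matched can be put back, in any order, by number.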

Another one I have used recently is:

Code:
([[:lower:]])\s*</p>\s*<p>\s*([[:lower:]])
That is for when a paragraph ends in a lower case letter and the next one starts with a lower case letter, possibly with whitespace around the tags. In that case, I am sure it is a single paragraph that has been split. For the first one, I generally look at each match to check what is actually intended.
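A minimal sketch of what that search does, written in Python so it can be tried outside the editor. Two things here are assumptions: Python's built-in re module does not support [[:lower:]], so [a-z] stands in for it (ASCII letters only), and the replacement "\1 \2" with a single space is a guess at a sensible replacement, since only the search expression is given above.

```python
import re

# [a-z] is a stand-in for [[:lower:]], which Python's re module does not support.
SPLIT_PARAGRAPH = re.compile(r"([a-z])\s*</p>\s*<p>\s*([a-z])")

def join_split_paragraphs(html: str) -> str:
    # Keep the two captured letters and replace everything between them
    # (the closing tag, the opening tag and any whitespace) with one space.
    return SPLIT_PARAGRAPH.sub(r"\1 \2", html)

sample = "<p>The quick brown fox</p>\n<p>jumped over the dog.</p>"
print(join_split_paragraphs(sample))
# prints "<p>The quick brown fox jumped over the dog.</p>"
```

A paragraph that ends in a full stop or starts with a capital letter does not match, so it is left alone.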

And this one doesn't cater for a class on the <p> tag. If I am doing this amount of fixing, I remove the class from the normal paragraphs first. If any paragraphs still have a class after that, it probably means there is other formatting that I don't want to lose.
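Removing the class from the normal paragraphs could look something like the sketch below. The class name "calibre1" is only a placeholder; the real name depends on the book's stylesheet.

```python
import re

# Hypothetical class name for the ordinary paragraph style in this book.
NORMAL_CLASS = "calibre1"

# Match only <p> tags whose sole attribute is that class.
NORMAL_PARAGRAPH = re.compile(r'<p\s+class="' + re.escape(NORMAL_CLASS) + r'"\s*>')

def strip_normal_class(html: str) -> str:
    # Paragraphs with any other class are left untouched, so formatting
    # that matters is still visible afterwards.
    return NORMAL_PARAGRAPH.sub("<p>", html)

sample = '<p class="calibre1">Body text</p><p class="quote">A quotation</p>'
print(strip_normal_class(sample))
# prints '<p>Body text</p><p class="quote">A quotation</p>'
```

Once the normal paragraphs are plain <p> tags, the split-paragraph search above will find them, and it will skip any paragraph that still carries a class.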