Quote:
Originally Posted by ElMiko
So...
I'm cleaning up a book which has added title headings to the body of the text so that it looks like this:
Code:
<p>We were walking down the street when</p>
<p>THIS IS THE BOOK TITLE</p>
<p>we saw a squirrel sleeping in the middle of the road.</p>
Given the number of words in the title, and the fact that it is in all caps, this would generally be an easy fix. Unfortunately, the title has spaces thrown into it randomly so that it will look like:
Code:
THI S IS THE B OOK TITLE
or
THIS IS THE BO OK TI TLE
or
THIS I S THE BOOK TITLE
or
THIS IS THE B O O K TITLE
....etc
Is there any way to get a match by matching the letters in the string while ignoring the spaces? And furthermore, is it possible if the title is a mix of uppercase and lowercase?
Are you trying to fix or remove this paragraph?
An uppercase-only line inside a p tag pair is fairly easy to trap and remove.
Mixed case garbage
Set Case Sensitive Mode
Code:
<p>([A-Z]| )+</p>\s+
Note the vertical bar followed by a space.
Not tested. Use care. Abort (discard) if
should kill only the line with all caps and spaces
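For the case where the title text itself is known, one possible approach is to build the pattern from the title's letters and allow optional whitespace between every pair of letters. The sketch below is only an illustration in Python, not something run against the actual book; the function names `spaced_title_pattern` and `strip_title` are made up for this example, and it assumes the stray title always sits alone in its own `<p>` pair.

```python
import re

def spaced_title_pattern(title, ignore_case=True):
    """Build a regex matching <p>title</p> with spaces anywhere in the title.

    Hypothetical helper for illustration; assumes the title is the only
    content of its paragraph.
    """
    # Keep only the non-space characters of the known title, escaped
    # so punctuation in the title is treated literally.
    letters = [re.escape(ch) for ch in title if not ch.isspace()]
    # Allow any amount of whitespace (including none) between letters.
    body = r"\s*".join(letters)
    # IGNORECASE covers the mixed upper/lowercase variant of the title.
    flags = re.IGNORECASE if ignore_case else 0
    # Also swallow whitespace after </p> so no blank line is left behind.
    return re.compile(r"<p>\s*" + body + r"\s*</p>\s*", flags)

def strip_title(html, title):
    """Remove every paragraph that contains only the (spaced-out) title."""
    return spaced_title_pattern(title).sub("", html)

if __name__ == "__main__":
    sample = (
        "<p>We were walking down the street when</p>\n"
        "<p>THI S IS THE B OOK TITLE</p>\n"
        "<p>we saw a squirrel sleeping in the middle of the road.</p>"
    )
    print(strip_title(sample, "THIS IS THE BOOK TITLE"))
```

The same idea can be typed by hand into a regex search box as T\s*H\s*I\s*S\s*I\s*S and so on between the p tags, with case sensitivity switched off for a mixed-case title. As with the pattern above, check each match before replacing.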