MobileRead Forums - View Single Post - Match a string while ignoring some characters in that string?

theducks · 12-01-2011, 01:11 PM

Quote:

Originally Posted by ElMiko

So...

I'm cleaning up a book which has added title headings to the body of the text so that it looks like this:

Code:

<p>We were walking down the street when</p>

<p>THIS IS THE BOOK TITLE</p>

<p>we saw a squirrel sleeping in the middle of the road.</p>

Given the number of words in the title, and the fact that it is in all caps, this would generally be an easy fix. Unfortunately, the title has spaces thrown into it randomly so that it will look like:

Code:

THI S IS THE B OOK TITLE
or
THIS IS THE BO OK TI TLE
or
THIS I S THE BOOK TITLE
or
THIS IS THE B O O K TITLE
....etc

Is there any way to get a match by matching the letters in the string while ignoring the spaces? And furthermore, is it possible if the title is a mix of uppercase and lowercase?

Are you trying to fix or remove this paragraph?
Uppercase only inside a p tag pair is fairly easy to trap and remove.
Mixed case garbage

Set Case Sensitive Mode

Code:

<p>([A-Z]| )+</p>\s+

Note the vertical bar followed by a space: it lets the group match either a capital letter or a space.
Not tested. Use care. Abort (discard) if

This should kill only the line with all caps and spaces.
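
For anyone who wants to try this outside the editor first, here is a rough sketch in Python. It assumes Python's re module, whose syntax may differ slightly from your editor's search box. The first pattern is the one above with balanced parentheses; it is case sensitive by default, so it only catches the all-caps version. The helper title_pattern is a made-up name for illustration: it builds a pattern from a known title that allows optional whitespace between every letter and ignores case, which covers the mixed-case situation.

```python
import re

# The pattern from the post: a paragraph containing only capital letters
# and spaces, plus the whitespace that follows it. Case sensitive by default.
ALL_CAPS_PARAGRAPH = re.compile(r"<p>([A-Z]| )+</p>\s+")


def title_pattern(title):
    """Hypothetical helper: match a known title inside <p>...</p>,
    allowing stray whitespace between any two characters, ignoring case."""
    # Drop the title's own spaces, escape each remaining character,
    # then allow optional whitespace between every pair of characters.
    chars = [re.escape(ch) for ch in title if not ch.isspace()]
    body = r"\s*".join(chars)
    return re.compile(r"<p>\s*" + body + r"\s*</p>\s*", re.IGNORECASE)


sample = (
    "<p>We were walking down the street when</p>\n\n"
    "<p>THI S IS THE B OOK TITLE</p>\n\n"
    "<p>we saw a squirrel sleeping in the middle of the road.</p>\n"
)

# Remove the all-caps paragraph; the ordinary paragraphs are left alone.
cleaned = ALL_CAPS_PARAGRAPH.sub("", sample)
print(cleaned)

# Mixed-case title with random spaces, removed via the title-specific pattern.
mixed = "<p>Th is Is the Bo ok Ti tle</p>\n"
print(repr(title_pattern("This Is the Book Title").sub("", mixed)))
```

The all-caps pattern will also remove any legitimate paragraph that happens to contain only capitals and spaces, so the title-specific version is the safer choice when you know the exact title. Either way, check the matches before replacing all.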