MobileRead Forums - Regex to find multiple spaces between HTML tags

Phssthpok · 11-17-2017, 12:45 PM

Quote:

Originally Posted by mikapanja

Yes, but what if there were spaces non-adjacent to the closing (or opening) tags?
And what if it was another pair of tags?

I.e. is there a generic regex which would find multiple spaces in any position between any opening and closing tags?

Find: (<(\w+)[^>]*>.*?) {2,}(.*?</\2>)
Replace with: \1 \3

This uses a literal space rather than \s, because \s also matches newlines. Use [ \t] instead of the space if you want to catch tabs as well. The problem is that this matches the first element containing multiple spaces, which will be <body>...</body>. It is better to list the tags you want to match, something like this:

Find: (<(p|div|blockquote|i)( [^>]*)?>.*?) {2,}(.*?</\2>)
Replace with: \1 \4
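
If you want to try this outside the editor, here is a minimal sketch in Python's re module, which accepts the same pattern. The function name collapse_spaces and the sample string are made up for illustration. One thing it shows: each match collapses only one run of spaces per element, so the replacement is repeated until nothing matches (in an editor, run Replace All again until the count is zero).

```python
import re

# Tag-list pattern from the post; the space before {2,} is a literal space.
PATTERN = re.compile(r"(<(p|div|blockquote|i)( [^>]*)?>.*?) {2,}(.*?</\2>)")


def collapse_spaces(html: str) -> str:
    """Repeat the replacement until no listed element contains a run of 2+ spaces.

    Hypothetical helper, not part of any editor's API.
    """
    while True:
        # One pass fixes at most one run of spaces per matched element.
        html, count = PATTERN.subn(r"\1 \4", html)
        if count == 0:
            return html


# Made-up sample with several runs of spaces inside <p> and <div>.
sample = '<p>one  two   three</p>\n<div class="x">a    b</div>'
print(collapse_spaces(sample))
```

The loop always ends, because every substitution makes the text shorter.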

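The use of ( [^>]*)? instead of [^>]* is to guard against matching e.g. <img> with </i>: with plain [^>]*, the "i" alternative would match the start of <img ...> and the rest of the tag name would be swallowed as if it were attributes.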
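
To see the difference, here is a small sketch comparing the two forms in Python. The names LOOSE and STRICT and the sample string are just for illustration.

```python
import re

# Same pattern with and without the guard on the tag name.
LOOSE = re.compile(r"(<(p|div|blockquote|i)[^>]*>.*?) {2,}(.*?</\2>)")
STRICT = re.compile(r"(<(p|div|blockquote|i)( [^>]*)?>.*?) {2,}(.*?</\2>)")

# Made-up sample: the double space sits between the elements, not inside <i>.
sample = '<img src="a.png">  <i>text</i>'

loose = LOOSE.search(sample)
strict = STRICT.search(sample)

# LOOSE treats <img ...> as an opening <i> and pairs it with </i>.
print(loose.group(0) if loose else None)
# STRICT requires a space or ">" right after the tag name, so nothing matches.
print(strict.group(0) if strict else None)
```

The same guard also keeps the "p" alternative from matching <pre>.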