11-17-2017, 12:45 PM   #7
Phssthpok
Age improves with wine.
Posts: 585
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
Quote:
Originally Posted by mikapanja
Yes, but what if there were spaces non-adjacent to the closing (or opening) tags?
And what if it was another pair of tags?

I.e. is there a generic regex which would find multiple spaces in any position between any opening and closing tags?
Find: (<(\w+)[^>]*>.*?) {2,}(.*?</\2>)
Replace with: \1 \3

This uses a literal space rather than \s, because \s also matches newlines. Use [ \t] if you want to match tabs as well. The problem is that it will match the first tag containing multiple spaces, which will be <body>...</body>; it is better to list the tags you want to match, like this:

Find: (<(p|div|blockquote|i)( [^>]*)?>.*?) {2,}(.*?</\2>)
Replace with: \1 \4

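The use of ( [^>]*)? instead of [^>]* is to guard against matching e.g. <img> with </i>. With plain [^>]*, the tag-name group can match just the i of <img> and leave mg plus the attributes to [^>]*, so \2 becomes i and pairs with a later </i>. Requiring a space or > straight after the tag name prevents that.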
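
To try the two patterns outside the editor, here is a minimal sketch in Python. It assumes Python's re module behaves like the editor's engine for these constructs (it does for lazy quantifiers, backreferences and \1-style replacements, but this has not been checked against any particular editor). The names LOOSE, STRICT, collapse_spaces and the sample strings are made up for illustration. Since each match runs through to the closing tag, one pass only collapses one run of spaces per match, so the sketch repeats the replacement until nothing changes.

```python
import re

# First pattern from the post: any tag name, attributes matched with [^>]*.
LOOSE = re.compile(r'(<(\w+)[^>]*>.*?) {2,}(.*?</\2>)')

# Second pattern from the post: explicit tag list, and ( [^>]*)? so the
# tag name must be followed directly by a space or by '>'.
STRICT = re.compile(r'(<(p|div|blockquote|i)( [^>]*)?>.*?) {2,}(.*?</\2>)')


def collapse_spaces(html: str) -> str:
    """Apply the STRICT find/replace repeatedly until no match is left."""
    while True:
        html, count = STRICT.subn(r'\1 \4', html)
        if count == 0:
            return html


# Two runs of spaces inside <p>; the double space in the <img> attribute
# sits outside any listed tag pair and should be left alone.
sample = '<p class="x">one  two   three</p>\n<img src="a  b.png"/><i>ok</i>'
print(collapse_spaces(sample))

# The case the guard is for: an <img> followed later by a stray </i>.
tricky = '<img src="a.png"/>x  y</i>'
print(LOOSE.search(tricky) is not None)   # loose pattern pairs <img> with </i>
print(STRICT.search(tricky) is not None)  # strict pattern does not
```

In the editor, the equivalent of the loop is simply running Replace All again until it reports no more matches.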