MobileRead Forums - View Single Post

Tex2002ans · 08-03-2019, 05:20 PM

Quote:

Originally Posted by odamizu

I get that .*? is limited to a single line, whereas [^>]+> isn't, and that's what messed me up here. But are there other advantages to using the [^x]+x construction over .*?

In plain English:

#1: [^>]+ = Any character that's NOT a '>' 1 or more times.

#2: .* = ANYTHING (except a line break) 0 or more times.

Note: The '?' at the end makes it "lazy" (less greedy), so it tries to match as little as possible.
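Here is a minimal sketch of greedy vs. lazy vs. the negated class. It uses Python's re module, which is an assumption on my part (the regex engine in your editor may differ in small details), and the sample text is made up:

```python
import re

# Made-up sample text, just for illustration.
text = "<i>one</i> and <b>two</b>"

# Greedy: .* grabs as much as it can, then backs up to the LAST '>'.
greedy = re.search(r"<.*>", text).group(0)

# Lazy: .*? grabs as little as it can, so it stops at the FIRST '>'.
lazy = re.search(r"<.*?>", text).group(0)

# Negated class: [^>]+ cannot step over a '>', so it also stops at the first one.
negated = re.search(r"<[^>]+>", text).group(0)

print(greedy)   # <i>one</i> and <b>two</b>
print(lazy)     # <i>
print(negated)  # <i>
```

On a simple one-line case like this, the lazy version and the negated class give the same result.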

So, what #1 is saying is:

Is the next character a '>'? No? Keep going until you hit a '>'.
If you hit the end of a line, keep going.

What #2 is saying is:

Is the next character anything at all? Yes? Keep going until you hit a '>'.
If you hit the end of a line, stop.
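The line-break difference can be sketched like this, again assuming Python's re module and a made-up tag that is split across two lines:

```python
import re

# Made-up example: one opening tag broken across two lines.
text = '<p class="a"\n   id="b">Hello</p>'

# [^>]+ treats a line break as just another "not a >" character,
# so it walks across the line break until it hits the '>'.
negated = re.search(r"<p[^>]+>", text)

# By default '.' does not match a line break, so .*? gives up
# at the end of the first line and the search finds nothing.
dot = re.search(r"<p.*?>", text)

# With the DOTALL flag (often called "dot matches newline" in editors),
# '.' crosses line breaks too, and the two patterns agree again.
dot_all = re.search(r"<p.*?>", text, flags=re.DOTALL)

print(negated.group(0))  # the whole two-line opening tag
print(dot)               # None
print(dot_all.group(0) == negated.group(0))  # True
```

So whether .*? stops at the end of a line depends on the "dot matches newline" setting, while [^>]+ behaves the same either way.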

There are some advantages to either method, but if you don't know enough about Regex, it'll probably just confuse you.

Just know that Regex #1 would be "less dangerous", and #2 could potentially be much more dangerous. Sometimes you accidentally Replace All and big chunks of your text get deleted because of some weird edge case. :P
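One edge case of that kind, as a rough sketch (Python's re module assumed; the tags and the "junk" class name are hypothetical): say you want to delete only the tags marked class="junk".

```python
import re

# Made-up text: one tag to keep, one tag to delete.
text = '<br class="keep">Important text<br class="junk">more'

# .*? is lazy, but it will still stretch as far as it has to for the
# rest of the pattern to match. Starting at the FIRST <br, it runs
# straight through "Important text" until it finds class="junk">.
dangerous = re.sub(r'<br.*?class="junk">', "", text)

# [^>]+ can never step past the '>' that closes the first tag,
# so only the second tag can match.
safer = re.sub(r'<br[^>]+class="junk">', "", text)

print(dangerous)  # more
print(safer)      # <br class="keep">Important textmore
```

The lazy version quietly ate the tag you wanted to keep plus the text after it, which is exactly the sort of thing that is easy to miss in a Replace All.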

Edit: Actually mixed up some explanations.