11-25-2017, 08:07 AM   #1
calmeilles
Small suggestion for the search regex documentation

I was trying to search with the regex "[A-Z][A-Z][A-Z]", looking for 3 consecutive capital letters, but the search was case-insensitive, so I was getting results such as ABC, AbC and abc, which came as a considerable surprise.

I believe that this is because my install has LOCALE=en_GB and the collate order for that is case insensitive. I may be wrong, but it's my best guess and actually doesn't matter.*

The Regex documentation page explains how to make a case-sensitive search [which is what we'd normally expect] ignore case with the "(?i)" syntax, but not how to do the reverse. Tracking down what was required turned out to be quite a chore, and I feel a note in the documentation would be useful.

What I ended up with was

Code:
(?-i:[A-Z]{3})
It's the possibility of -i that's missing from the documentation, and it was quite obscure even in the Python docs.
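For anyone who wants to check the behaviour outside the search box, here is a minimal sketch in plain Python (3.6 or later, which is when the scoped "(?-i:...)" form was added to the re module). It assumes the search behaves like a Python regex compiled with the IGNORECASE flag switched on, which I have not confirmed. The sample text and the function names are made up for illustration.

```python
import re

# Hypothetical sample text, just for illustration.
SAMPLE = "ABC AbC abc XYZ"


def find_caps_insensitive(text):
    # Simulates a search where case-insensitivity is switched on globally
    # (an assumption about what the search box does).
    return re.findall(r"[A-Z]{3}", text, re.IGNORECASE)


def find_caps_sensitive(text):
    # Same global IGNORECASE flag, but (?-i:...) turns it off
    # for the group, so only real capitals match.
    return re.findall(r"(?-i:[A-Z]{3})", text, re.IGNORECASE)


if __name__ == "__main__":
    print(find_caps_insensitive(SAMPLE))
    print(find_caps_sensitive(SAMPLE))
```

The first call matches every three-letter run regardless of case, while the second only matches ABC and XYZ, which is the behaviour I was after.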

(*I am curious if this is true or something else caused it.
If my guess is right, then mentioning that LOCALE can seriously affect your regexes would also be useful.)