Thread: Regex examples
08-05-2019, 12:54 AM   #592
DNSB
Bibliophagist
Quote:
Originally Posted by odamizu
Another regex question if you will indulge me:

What's the difference between (?U) and ?

Is there an advantage to using (?U).* rather than .*? or vice versa?

Thank you!
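If you are dealing with Unicode and using Python 2, (?u) would be useful (note the lowercase u in Python's inline syntax). It makes \w, \W, \b, \B, \d, \D, \s and \S follow the Unicode character properties, and it makes ignorecase match non-ASCII characters as well. It is likely documented elsewhere too, but check section 7.2 of the Python 2 docs (the re module, "Regular expression operations") for more information. Please note that it is not the same as ? and is not used the same way: a ? after a quantifier makes that one quantifier lazy, while (?u) is a flag that says to treat the pattern and input as Unicode. It modifies how the input and pattern are interpreted but is not part of those strings.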

So something like (?u)(.*?) instead of (.*?) if you want to match on Unicode.
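
To make the flag-versus-quantifier point concrete, here is a small sketch. It assumes Python 3 and the standard re module; the sample text and variable names are made up for illustration.

```python
import re

# Made-up sample text containing two non-ASCII words.
text = "<b>na\u00efve</b> and <b>caf\u00e9</b>"

# Greedy: .* grabs as much as it can, up to the LAST </b>.
greedy = re.search(r"<b>(.*)</b>", text).group(1)

# Lazy: the trailing ? makes .* stop at the FIRST </b>.
lazy = re.search(r"<b>(.*?)</b>", text).group(1)

# (?u) is a flag on the whole pattern, not a quantifier modifier,
# so it leaves the lazy match exactly as it was.
flagged = re.search(r"(?u)<b>(.*?)</b>", text).group(1)

print(greedy)
print(lazy)
print(flagged)
```

The first line printed spans both tags, while the second and third are identical, so the ? controls how much gets matched and (?u) only controls how characters are classified.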

OTOH, I vaguely remember that Python 3 matches on Unicode by default (for str patterns, at least), making (?u) and its equivalents (re.U, re.UNICODE) redundant there.
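
If you want to check that for yourself, something along these lines should do it. This is only a sketch assuming Python 3 and the standard re module; the sample word is arbitrary.

```python
import re

# Arbitrary sample word ending in a non-ASCII letter (e with acute accent).
word = "caf\u00e9"

# Python 3 with a str pattern: \w already matches Unicode letters.
default = re.findall(r"\w+", word)

# Adding the inline flag or re.UNICODE changes nothing for str patterns.
inline = re.findall(r"(?u)\w+", word)
with_flag = re.findall(r"\w+", word, re.UNICODE)

# re.ASCII is the opt-out: \w goes back to [a-zA-Z0-9_] only.
ascii_only = re.findall(r"\w+", word, re.ASCII)

print(default, inline, with_flag, ascii_only)
```

The first three results are the same, and only the re.ASCII version drops the accented letter, which is why (?u) mostly matters for Python 2.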