Thread: Regex examples
08-05-2019, 12:45 PM   #596
DiapDealer
Quote:
Originally Posted by DNSB View Post
@diagdealer: Thanks for the correction. I ran into the (?u) as being Unicode related playing with Python a while back and used ? for when I wanted ungreedy.
No problem. Easy enough assumption to make. The short forms of the Python regex flags line up pretty well with the PCRE mode modifiers, with one exception: re.U.

re.I (?i) ignore case
re.S (?s) single line (dot also matches newline)
re.M (?m) multiline
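
For anyone who wants to check the pairing on the Python side, here is a minimal sketch. It only shows that Python's re module accepts both the flag constants and the matching inline modifiers; the sample text and variable names are made up for illustration, and it says nothing about how Sigil's PCRE behaves.

```python
import re

# Hypothetical sample text, used only for this illustration.
text = "First line\nsecond LINE"

# re.I and (?i): ignore case.
flag_i = re.findall(r"line", text, re.I)
inline_i = re.findall(r"(?i)line", text)

# re.S and (?s): the dot also matches a newline.
flag_s = re.search(r"First.*LINE", text, re.S)
inline_s = re.search(r"(?s)First.*LINE", text)

# Without re.S the dot stops at the newline, so nothing matches.
no_flag_s = re.search(r"First.*LINE", text)

# re.M and (?m): ^ and $ match at every line, not just the whole string.
flag_m = re.findall(r"^\w+", text, re.M)
inline_m = re.findall(r"(?m)^\w+", text)

print(flag_i, inline_i)
print(flag_s.group(0) == inline_s.group(0), no_flag_s)
print(flag_m, inline_m)
```

In each pair the flag and the inline modifier give the same result, which is the alignment described above.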

re.U turns on Unicode behavior for \w, \W, \b, and \B in Python, but (?U) puts the repetition operators (quantifiers) into ungreedy mode in PCRE.
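
A rough Python-side sketch of that difference, assuming Python 3 (where str patterns are already Unicode-aware, so re.U is effectively a no-op and re.ASCII is what switches it off). Python's inline form is the lowercase (?u); it has no (?U) modifier, so ungreedy matching is done per quantifier with a trailing ?. The sample strings and the compiles() helper are made up for illustration.

```python
import re

# Hypothetical sample word with a non-ASCII letter.
word = "caf\u00e9"  # "café"

# With Unicode behavior, \w matches the accented letter too.
unicode_words = re.findall(r"\w+", word, re.U)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the match stops before "é".
ascii_words = re.findall(r"\w+", word, re.ASCII)


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Lowercase (?u) is a valid inline flag in Python; uppercase (?U) is not.
lower_u_ok = compiles(r"(?u)\w+")
upper_u_ok = compiles(r"(?U)\w+")

# Ungreedy matching in Python: add ? after the quantifier.
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

print(unicode_words, ascii_words)
print(lower_u_ok, upper_u_ok)
print(greedy, lazy)
```

So in Python the "U" flag is about character classes, while the ungreedy behavior that PCRE's (?U) switches on globally has to be requested quantifier by quantifier.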

To turn on Unicode support for operators like \w, \W, \d, \b, etc. in PCRE, you need to preface the expression with (*UCP). And that only works if PCRE was compiled with Unicode support (which Sigil's PCRE clearly is).

Last edited by DiapDealer; 08-05-2019 at 01:02 PM.