06-10-2012, 05:40 PM   #7
DiapDealer
\p{L} or \p{Letter} or \pL matches any Unicode letter character. It will match things like Á or ö (as well as the "normal" [A-Za-z]).

\p{Lu} or \p{Uppercase_Letter} matches an uppercase Unicode letter character.
\p{Ll} or \p{Lowercase_Letter} matches a lowercase Unicode letter character.

http://www.regular-expressions.info/unicode.html
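
To see the three classes side by side, here's a rough sketch in Go, picked only because its standard regexp package understands \p{L}, \p{Lu} and \p{Ll} out of the box. The classify helper and the sample characters are made up for illustration, and I've stuck to the short names because support for long names like \p{Letter} varies from one regex engine to the next.

```go
package main

import (
	"fmt"
	"regexp"
)

// Short property names only; long-name support depends on the engine/version.
var (
	anyLetter = regexp.MustCompile(`\p{L}`)
	upper     = regexp.MustCompile(`\p{Lu}`)
	lower     = regexp.MustCompile(`\p{Ll}`)
)

// classify is a hypothetical helper. It expects a one-character string
// and reports which of the letter classes that character falls into.
func classify(s string) string {
	switch {
	case upper.MatchString(s):
		return "uppercase letter"
	case lower.MatchString(s):
		return "lowercase letter"
	case anyLetter.MatchString(s):
		return "other letter" // letters with no case, e.g. CJK ideographs
	default:
		return "not a letter"
	}
}

func main() {
	for _, s := range []string{"Á", "ö", "Z", "中", "7", "-"} {
		fmt.Printf("%q: %s\n", s, classify(s))
	}
}
```

Á and Z come back as uppercase, ö as lowercase, and the digit and hyphen as not letters at all.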

I use \p{L} and its variants simply because I got tired of screwing up books that contained Unicode characters where I least expected them. It's just a personal preference of mine to try to think in terms of Unicode as much as possible.
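
Here's the kind of screw-up I mean, as a minimal sketch in Go (again chosen only because its standard regexp package supports \p{L}). The function names and the sample text are invented for the example; the point is just what happens to accented words when the pattern only knows about [A-Za-z].

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	asciiWord   = regexp.MustCompile(`[A-Za-z]+`) // ASCII letters only
	unicodeWord = regexp.MustCompile(`\p{L}+`)    // any Unicode letter
)

// asciiWords returns every run of ASCII letters in text.
func asciiWords(text string) []string {
	return asciiWord.FindAllString(text, -1)
}

// unicodeWords returns every run of Unicode letters in text.
func unicodeWords(text string) []string {
	return unicodeWord.FindAllString(text, -1)
}

func main() {
	text := "Señor Müller, café"
	fmt.Println(asciiWords(text))   // [Se or M ller caf]
	fmt.Println(unicodeWords(text)) // [Señor Müller café]
}
```

The ASCII-only pattern chops each word at the accented letter, which is exactly how a search-and-replace ends up mangling a book. The \p{L} version keeps the words whole.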

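I believe you can also prefix your regular expressions with (*UCP) to make them "Unicode aware." In PCRE, that switches \w, \d, \s and \b over to Unicode properties, and it has to sit at the very start of the pattern.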