04-15-2011, 08:11 AM   #7
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by user_none
Use (?u) at the beginning of the pattern.

Code:
(?u)“
I'll need to look at why it's matching in the Regex Builder. It should not be. You need to manually specify that you are using, and want to match on, unicode characters.
(?u) only changes things like \w to use Unicode character maps; I don't think it tells Python that the string itself is Unicode. The string itself needs to be specified as a Unicode string, but I'm pretty sure this happens automatically for these config variables.
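
To illustrate the distinction, here is a minimal sketch. It assumes Python 3, where every str is already a Unicode string, so it is not the code the converter itself runs. The sample text and variable names are made up for the example. It suggests that a literal curly quote matches with or without (?u), while the flag only changes what a class like \w covers:

```python
import re

# Hypothetical sample: curly quotes around a word with a non-ASCII letter.
sample = "\u201ccaf\u00e9\u201d"

# A literal curly quote in the pattern matches with or without the flag,
# because the pattern and the text are both Unicode strings here.
plain_hits = re.findall("“|”", sample)
flagged_hits = re.findall("(?u)“|”", sample)

# The flag matters for character classes such as \w.
# (?a) restricts \w to ASCII; (?u) (the default for str patterns) does not.
ascii_words = re.findall(r"(?a)\w+", sample)
unicode_words = re.findall(r"(?u)\w+", sample)

print(plain_hits, flagged_hits)
print(ascii_words, unicode_words)
```

With (?a) the \w+ match stops before the accented letter, and with (?u) it covers the whole word. The curly-quote matches are the same either way.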

Anyway, my testing confirmed that Unicode characters worked just fine:
Code:
(“|”)
This matched all the curly quotes and replaced them with straight quotes in my test (both in the Regex Builder and in the actual conversion). I think something else must be the root cause.
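
For anyone who wants to try the same replacement outside the converter, a rough sketch in plain Python 3 follows. The function name straighten_quotes is hypothetical, and this is not how the conversion pipeline applies the pattern internally; it only shows the same regex doing the same job:

```python
import re

# Same pattern as in the post: either curly double quote.
CURLY_QUOTES = re.compile("(“|”)")

def straighten_quotes(text):
    """Replace curly double quotes with straight ones (illustrative helper)."""
    return CURLY_QUOTES.sub('"', text)

print(straighten_quotes("“Hello,” she said."))
```

If this works on a sample of the book's text but the conversion still misses the quotes, that would point to something other than the pattern itself.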