Old 03-08-2011, 06:54 PM   #12
ldolse
Wizard
(?u) should have worked; I just double-checked the docs. Is this what you tried?
Code:
(?u)(\w+), (\w+)
I'm not sure I would call \S+ the 'best' solution. It's a good solution for this specific problem, and \S+? might be a bit better if you were dealing with strings that had multiple commas. Mixx is also correct that semantically \S and \w are quite different: \S matches any non-whitespace character, punctuation included, while \w matches only word characters. The unicode flag is probably the most 'accurate' option.
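
To make the difference concrete, here is a rough sketch in plain Python 3, not calibre's own code. The sample names and the split_author helper are made up for illustration. One assumption to note: Python 3 already treats string patterns as Unicode, so (?u) is accepted but redundant there, and re.ASCII is used only to approximate the older ASCII-only behaviour of \w.

```python
import re


def split_author(pattern, name, flags=0):
    """Return the two captured groups if the pattern matches at the start, else None."""
    m = re.match(pattern, name, flags)
    return m.groups() if m else None


name = "Čapek, Karel"  # hypothetical sample author with a non-ASCII letter

# Unicode-aware \w: "Č" counts as a word character.
with_unicode = split_author(r"(?u)(\w+), (\w+)", name)

# ASCII-only \w (approximating the old default): "Č" is not a word character.
ascii_only = split_author(r"(\w+), (\w+)", name, re.ASCII)

# \S accepts any non-whitespace character, so it matches even with ASCII-only rules.
non_space = split_author(r"(\S+), (\S+)", name, re.ASCII)

# The semantic difference: \w stops at punctuation such as an apostrophe, \S does not.
apostrophe_w = split_author(r"(?u)(\w+), (\w+)", "O'Brien, Pat")
apostrophe_s = split_author(r"(\S+), (\S+)", "O'Brien, Pat")

print(with_unicode, ascii_only, non_space, apostrophe_w, apostrophe_s)
```

In this sketch the unicode-flag pattern and the \S+ pattern both split the accented name, the ASCII-only \w pattern fails on it, and only \S+ gets past the apostrophe.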

After thinking about it, I can't say that I'm a big fan of the Locale option. Based on the Python regex docs it would work, but only for one locale. If you had authors with non-ASCII characters from other locales it wouldn't work, which is a common scenario for translated works.

Last edited by ldolse; 03-08-2011 at 06:57 PM.