Old 03-08-2011, 04:48 PM   #8
user_none
Sigil & calibre developer
Quote:
Originally Posted by Mixx View Post
But isn't the definition of \w an alphabetic character? Or is it ASCII alphabetic character?
Depending on the sorting order, ö is within the set [a-z] (a..oö..z) or outside it (a..o..z..ö). But I thought that is set by the LOCALE, and I was delighted to be able to set Calibre (via a tweak) to a sort order other than just ASCII. This was a major improvement for me.
I'd expect that Calibre/Python all read the LOCALE and interpret \w accordingly. Unless this is a Python issue, not a Calibre issue.
All regex handling goes through Python's re module. Again, you need to specify the proper flags, such as u (re.UNICODE). Otherwise the expected (Python) behavior is for \w to treat only a-z and A-Z as alphabetic characters (alongside digits and the underscore). See the Python re documentation for more information on this topic.
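A minimal sketch of the difference, for anyone who wants to try it in a plain Python shell rather than inside calibre. The sample word is made up for illustration. One assumption to be aware of: on Python 3, str patterns are Unicode-aware by default, so the sketch passes re.ASCII explicitly to reproduce the ASCII-only \w described above.

```python
import re

# Hypothetical sample word containing ö (written as an escape so the
# file encoding does not matter).
word = "sch\u00f6n"

# re.ASCII restricts \w to [a-zA-Z0-9_], which is the behavior you get
# without the u flag on Python 2.
ascii_only = re.findall(r"\w+", word, flags=re.ASCII)

# re.UNICODE (the "u" flag) lets \w match letters such as ö.
unicode_aware = re.findall(r"\w+", word, flags=re.UNICODE)

# A literal range like [a-z] is compared by code point, so ö falls
# outside it whatever the locale's sort order is.
range_match = re.findall(r"[a-z]+", word)

print(ascii_only)     # ['sch', 'n']
print(unicode_aware)  # ['schön']
print(range_match)    # ['sch', 'n']
```

So if ö needs to count as a word character, rely on \w with the Unicode flag rather than on [a-z].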