MobileRead Forums

capnm · 11-30-2011, 11:58 PM

Quote:

Originally Posted by Serpentine

Supply a sample(s) and expected result(s), make life easy.

Sample: Föô bár
Expected result: Fb

Though that's pretty irrelevant. I'm not looking to debug this particular regex, or to start adding tons of individual Unicode characters to it.

I'm wondering whether calibre's flavor of regex is, or can be made, Unicode-aware. I suspect some flavors of regex are, but I've never had occasion to explore the issue before.
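To illustrate what "Unicode-aware" would mean here, a minimal sketch in plain Python 3 (not calibre's template language). The function names are made up for illustration, and this says nothing about how calibre itself compiles its patterns. It assumes the goal is the first letter of each word, as in the sample above.

```python
import re

def initials_unicode(name):
    # In Python 3, \w and \b on a str pattern are Unicode-aware by default,
    # so accented letters count as word characters.
    return "".join(re.findall(r"\b\w", name))

def initials_ascii(name):
    # re.ASCII restricts \w to [a-zA-Z0-9_]; an accented letter then acts
    # as a word break, so extra "initials" can leak into the result.
    return "".join(re.findall(r"\b\w", name, flags=re.ASCII))
```

With precomposed accented characters, the Unicode-aware version gives "Fb" for the sample, while the ASCII-only version also picks up the "r" that follows "á". On Python 2 the Unicode-aware behaviour would need the re.UNICODE flag and a unicode string.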

Alternatively, I thought there might be some calibre template function that would transliterate a Unicode string to plain ASCII (though that would have other side effects).
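As a rough idea of what such a transliteration step could look like, here is a sketch in plain Python using the standard unicodedata module. It is not a calibre template function, and strip_accents is a hypothetical name. It only removes combining accents after decomposition, which is a simplification of real transliteration.

```python
import unicodedata

def strip_accents(text):
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

After this step an ASCII-only pattern such as [a-zA-Z] would match the accented letters in the sample. One of the side effects: letters with no decomposition, such as "ø", pass through unchanged, so this is not a complete answer either.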

Quote:

Originally Posted by dwanthny

Also where are you using this and why?

At the moment, in custom columns and plugboards, to abbreviate long series names.

But again, it's more of a general question, since at various times, for various reasons, authors, titles, series, etc., get plugged into regexps, and they all have the occasional Unicode character that falls outside the standard [a-zA-Z] or \w range.