I've been using the following regex to abbreviate series names as initialisms:
Code:
\s*([a-zA-Z]|\d+\.?\d*)[a-z\']*\.?\s*
\1
Now that more and more of my series names include Unicode characters, I'm wondering if there is an easy way to either modify the [a-zA-Z] and [a-z'] terms to include the appropriate accented characters, or to transliterate (transcode?) the string before the regex is applied.
Or is my best bet just to manually transcode my series? (yuck)
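
For what it's worth, here is a rough sketch of the transliteration route, assuming the regex can be run from Python (the pattern is valid for Python's re module as written). The helper names strip_accents and initialism are made up for illustration. The idea is to decompose each accented character with unicodedata.normalize (NFKD form), drop the combining marks, and then apply the original pattern unchanged.

```python
import re
import unicodedata

# The original search pattern, unchanged; the replacement is \1.
PATTERN = re.compile(r"\s*([a-zA-Z]|\d+\.?\d*)[a-z\']*\.?\s*")


def strip_accents(text):
    """Decompose accented characters (NFKD) and drop the combining marks.

    Hypothetical helper: characters with no decomposition (for example
    'ø' or 'ß') pass through unchanged and would need a manual mapping.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def initialism(series):
    """Transliterate to unaccented letters, then apply the original regex."""
    return PATTERN.sub(r"\1", strip_accents(series))


print(initialism("Élan Vital"))      # EV
print(initialism("Les Misérables"))  # LM
```

If you would rather keep the accents in the result, the other route is to widen the character classes instead. In Python 3 string patterns, [^\W\d_] matches any Unicode letter, so it could stand in for [a-zA-Z], but re has no built-in class for "lowercase Unicode letter", so the [a-z'] part is harder to translate faithfully.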