You can most likely use something like :
Code:
(?i)(?:^|\s+)(\d+\.?\d*?|[\D])
To grab all of the interesting first characters/numbers
Code:
string = r'Föô bár šjohka'
>>> regex.findall(string)
[u'F', u'b', u'\xe1']
I'm sure you can work it into a replacement without too much of a problem.