Re: Unicode Issues, Comparing Strings, ISO Encoding, SQL and Other Maladies
After running into a multitude of Unicode issues when comparing strings between UTF-8 and ISO-8859-15, and more generally when building SQL statements from variables whose values originated in metadata.db and were causing runtime failures, I stumbled upon this little gem, which I would like to share with the forum. I have not seen this syntax in any other documentation. See:
http://stackoverflow.com/questions/2...15-with-python
>>> a = 'ü'
>>> a.decode('utf8') # terminal is configured to use UTF-8 by default
u'\xfc'
>>> a.decode('utf8').encode('iso8859-15')
'\xfc'
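So, the secret to keeping Python 2 from "covertly" decoding to ASCII before it encodes (or tries to encode) to iso8859-15 is to decode explicitly first. When .encode() is called directly on a byte string, Python 2 first decodes it with the ascii codec behind the scenes, and that hidden step raises a UnicodeDecodeError as soon as it meets a non-ASCII character, such as those in 'não-ficção'. Use this syntax instead: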
>>>>>> a.decode('utf8').encode('iso8859-15') <<<<<<
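
For anyone who wants to reuse this, here is a minimal sketch of the same idea wrapped in two small functions. The names utf8_to_latin9 and same_text are my own, purely for illustration, and the sketch assumes the incoming value really is a UTF-8 byte string. The second function covers the comparison case: rather than comparing raw bytes, it decodes each side with its own codec and compares the resulting Unicode text.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def utf8_to_latin9(raw):
    """Transcode UTF-8 bytes to ISO-8859-15 bytes (hypothetical helper)."""
    # Decode explicitly first, so Python 2 never falls back to its
    # implicit ASCII decode when .encode() is called on a byte string.
    return raw.decode('utf8').encode('iso8859-15')


def same_text(utf8_bytes, latin9_bytes):
    """Compare two byte strings by decoding each with its own codec."""
    return utf8_bytes.decode('utf8') == latin9_bytes.decode('iso8859-15')


# 'não-ficção' written out as UTF-8 bytes, so the example does not
# depend on how the terminal or source file is encoded.
utf8_word = b'n\xc3\xa3o-fic\xc3\xa7\xc3\xa3o'
latin9_word = utf8_to_latin9(utf8_word)

print(repr(latin9_word))
print(same_text(utf8_word, latin9_word))
```

The byte literals keep the snippet independent of the terminal encoding, and the same calls also run under Python 3, where bytes and text are already separate types. A character that has no ISO-8859-15 equivalent will still raise a UnicodeEncodeError in the encode step, so this is not a cure for every case.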