Quote:
Originally Posted by nickredding
I'm having a problem with a news feed that includes literal em dashes (instead of using &mdash;). They are handled as follows: the em dash is recognized as such and translated into a Unicode em dash (U+2014), which then turns up in the output as its UTF-8 encoding (0xE2 0x80 0x94) and is displayed as â€”, which is the CP1252 interpretation of those three bytes. I can't figure out how to fix this; preprocess_regexps doesn't work. Can anyone help?
Would you mind sending your code?
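In the meantime, here is a minimal Python sketch that reproduces the symptom as described, in case it helps narrow things down. It is not taken from your recipe: the helper name fix_mojibake is made up for illustration, and the sketch assumes the only thing going wrong is that UTF-8 bytes are being read back as CP1252 somewhere along the way.

```python
# Illustration only: reproduces the symptom from the quoted post.
# Assumption: the output is valid UTF-8 that some reader decodes as CP1252.

EM_DASH = "\u2014"  # Unicode em dash, U+2014

# Step 1: the em dash is written out as UTF-8, giving three bytes.
utf8_bytes = EM_DASH.encode("utf-8")  # b'\xe2\x80\x94'

# Step 2: a reader that assumes CP1252 turns those bytes into three characters.
misread = utf8_bytes.decode("cp1252")  # U+00E2, U+20AC, U+201D

# Step 3: reversing the wrong decode recovers the original character.
repaired = misread.encode("cp1252").decode("utf-8")


def fix_mojibake(text):
    """Hypothetical helper: undo a UTF-8-read-as-CP1252 mix-up.

    Returns the text unchanged if it does not round-trip cleanly, so
    strings that were never damaged are left alone.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    print(utf8_bytes)
    print([hex(ord(c)) for c in misread])
    print(repaired == EM_DASH)
```

If the bytes in the output file really are 0xE2 0x80 0x94, the file itself may be fine and the viewer is simply assuming the wrong encoding; in that case declaring UTF-8 in the output is probably a better fix than rewriting the text.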