The original news source looks like
and it shows up in the input directory of the debug pipeline as
where those three characters are the three bytes of the UTF-8 encoding of an em dash, each displayed as a separate character. I'm viewing the input directory with MS Expression Web. What I don't understand is how these three UTF-8 bytes are supposed to be turned back into an em dash for display on a device via Mobipocket Reader, Kindle, MS Expression Web, or anything else.
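To make the question concrete, here is a minimal Python sketch of what I think is happening. It assumes the viewer is interpreting the file as Windows-1252 rather than UTF-8; that is my guess, not something I have confirmed about Expression Web or the pipeline. The variable names are only for illustration.

```python
# An em dash is the single code point U+2014.
EM_DASH = "\u2014"

# Encoded as UTF-8 it becomes three bytes: E2 80 94.
utf8_bytes = EM_DASH.encode("utf-8")

# Assumption: a viewer that treats the file as Windows-1252 maps each
# byte to its own character, so one em dash shows up as three characters.
misread = utf8_bytes.decode("cp1252")

# A reader that treats the same bytes as UTF-8 gets the em dash back.
restored = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex(" "))          # the three bytes in hex
print(len(misread), len(restored))  # character counts for each reading
```

If this is right, the bytes on disk are fine and nothing needs to convert them back; the displaying program only needs to know the file is UTF-8, which it would presumably learn from an encoding declaration in the file or from a setting in the tool.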