@Manichean
A bug may actually have crept into the .8 code. I have been seeing
words run together that were not run together in the source HTML. One case was an mdash (I have no idea which character code was really used; cutting and pasting a — over it worked in Sigil).

Ellipses are a likely candidate for the others.
I tried different Heuristic settings, and with them the unknown character
was at least passed through, so it could be fixed in Sigil.
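
If it helps to pin down which character code is actually in the source, here is a rough Python sketch that lists every non-ASCII character in a piece of text with its Unicode code point and name. The function name `non_ascii_report` and the sample string are made up for illustration; this is not part of any of the tools mentioned above.

```python
import unicodedata
from collections import Counter

def non_ascii_report(text):
    """Return (code point, Unicode name, count) for each non-ASCII character."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    report = []
    for ch, n in sorted(counts.items()):
        # Fall back to "UNKNOWN" for characters that have no Unicode name.
        name = unicodedata.name(ch, "UNKNOWN")
        report.append((f"U+{ord(ch):04X}", name, n))
    return report

# Hypothetical sample; in practice, read the HTML file's text instead.
sample = "first\u2014second and so on\u2026"
for code, name, n in non_ascii_report(sample):
    print(code, name, n)
```

Running it over the source HTML and again over the converted output should show whether the dash or ellipsis character changed or disappeared along the way.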