More data:
In HTML/Entities.pm, &raquo; is defined as "»", while &rsquo; and &rdquo; are chr(8217) and chr(8221). &mdash; is also defined numerically, as chr(8212), and indeed it causes the same problem.
Maybe the existence of these non-Latin-1 (or whatever) characters in an HTML element causes the whole element to be encoded in Unicode (or whatever), where the equivalent of &nbsp; (defined as "\240") is recognized as a space character. If the "space compacting" is done in Latin-1 encoding, the &nbsp; is not changed, but if it is done in Unicode, it is "compacted" into a space.
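A minimal sketch of how this guess could be checked outside the module. The compact_spaces helper is hypothetical: it stands in for whatever the "space compacting" step really does and is not taken from the module's code. It also assumes a Perl running without the unicode_strings feature, since that feature changes how \s treats chr(160).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the "space compacting" step (an assumption,
# not the module's actual code): collapse whitespace runs to one space.
sub compact_spaces {
    my ($text) = @_;
    $text =~ s/\s+/ /g;
    return $text;
}

# "\240" (octal) is chr(160), the value HTML::Entities uses for &nbsp;.
my $latin1 = "a\240b";              # every character is below 256
my $mixed  = "a\240b" . chr(8217);  # chr(8217) (&rsquo;) forces the string
                                    # into Perl's internal Unicode form

for my $case ( [ latin1 => $latin1 ], [ mixed => $mixed ] ) {
    my ( $name, $input ) = @$case;
    my $output = compact_spaces($input);

    # index() compares characters, so it works for both representations.
    my $state = index( $output, "\240" ) >= 0 ? "kept" : "turned into a space";
    print "$name: nbsp $state\n";
}
```

If the guess is right, the first string should keep its chr(160) and the second should come out with a plain space, even though both go through the same substitution.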