Originally Posted by NiLuJe
@sherman: I'll be busy for the next few hours, so go ahead.
Initial findings suggest that the various c_index vs. chars_in_str checks are wrong, because c_index is the *byte* index into the string array, while chars_in_str is the number of Unicode "characters" (code points).
Since bytes >= Unicode chars, the two only agree for pure ASCII: every multi-byte character pushes c_index further ahead of the character count, so the drift slowly accumulates as more of them are encountered.
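To illustrate the mismatch, here is a minimal sketch, assuming the strings are valid UTF-8. The helper names `utf8_count_codepoints` and `utf8_next` are hypothetical and not taken from the code under discussion; they only show how a byte index and a character count diverge, and one way to step a byte index a whole character at a time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a UTF-8 string.
 * Continuation bytes have the form 10xxxxxx, so every byte that is NOT
 * a continuation byte starts a new code point. Assumes valid UTF-8. */
static size_t utf8_count_codepoints(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: given a byte index at the start of a code point,
 * return the byte index of the next code point (or of the terminating
 * NUL). This keeps a byte index and a character counter in step. */
static size_t utf8_next(const char *s, size_t byte_index)
{
    if (s[byte_index] == '\0') {
        return byte_index;
    }
    byte_index++;
    while (((unsigned char)s[byte_index] & 0xC0) == 0x80) {
        byte_index++;
    }
    return byte_index;
}
```

For the string "h\xC3\xA9llo" ("héllo"), `strlen` reports 6 bytes while `utf8_count_codepoints` reports 5 characters, so comparing a byte index against a character count is already off by one after a single two-byte character.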
Of lesser import, the md stuff is also massively not Unicode-aware, but we already knew that ;p.