Old 02-16-2014, 08:07 AM   #1
arspr
Dead account. Bye
Stress tests with Unicode "special" characters

Just for fun, I've started playing a little with Unicode characters, and I think I've found some "bugs"/possible enhancements.



First one. It really seems to be a true bug. Example given here (in Wikipedia>Unicode in Spanish). Some characters, like accented letters or the Spanish "ñ", can be encoded either as a single precomposed character ("ñ" is U+00F1) or as a combination of "n" (U+006E) plus a combining tilde (U+0303).

OK, the problem is that if you use the second form, the next character is also rendered ON TOP of that compound "ñ" in the Editor. Look at the attached screenshot, where the "<" from "</p>" is rendered over the last "ñ". The first "ñ" is a true single one (U+00F1).
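
To make the two encodings concrete, here is a minimal Python sketch. It has nothing to do with the Editor's own code; the variable names are only for illustration. It shows that the two forms differ as strings even though they should look identical, and that Unicode normalization (NFC/NFD) converts one into the other.

```python
import unicodedata

precomposed = "\u00f1"   # "ñ" as a single code point (U+00F1)
combined = "n\u0303"     # "n" (U+006E) followed by COMBINING TILDE (U+0303)

# The two strings have different lengths and do not compare equal.
print(len(precomposed), len(combined))   # 1 2
print(precomposed == combined)           # False

# NFC composes the pair into U+00F1; NFD decomposes U+00F1 into the pair.
print(unicodedata.normalize("NFC", combined) == precomposed)   # True
print(unicodedata.normalize("NFD", precomposed) == combined)   # True
```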



Second one. 50% chance of being a bug. If I type "ñ" in the search text box, Find/Replace gives me only one match: the true "ñ" (U+00F1), but not the combined one. BUT if I use Count All/Replace All, I get both matches. An inconsistency here?
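
I don't know how the Editor implements either search, so the following is only a sketch of the two behaviours, not an explanation of the actual cause. The hypothetical `count_matches` helper counts raw code-point matches when `normalize=False` and normalizes both the search term and the text to NFC otherwise, which is one way to get the "one match" and "two matches" results from the same text.

```python
import unicodedata

def count_matches(needle: str, text: str, normalize: bool = True) -> int:
    """Count occurrences of needle in text, optionally after NFC normalization."""
    if normalize:
        needle = unicodedata.normalize("NFC", needle)
        text = unicodedata.normalize("NFC", text)
    return text.count(needle)

# One precomposed "ñ" and one "n" + combining tilde.
text = "a\u00f1o and an\u0303o"
needle = "\u00f1"

print(count_matches(needle, text, normalize=False))  # 1
print(count_matches(needle, text))                   # 2
```

A real search would also need to map match positions back to the original, un-normalized text before highlighting or replacing, which this sketch does not attempt.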



Third one. Clearly a feature request. Imagine you've solved issue #1. Even so, these kinds of situations (or the presence of a soft hyphen (U+00AD), and possibly of other "hidden" characters) are a PITA for editing, because you get no visual clue of their presence.

For example, in the "cubierta" word at the start I've added a soft hyphen, which you cannot see. So if I search for "cubierta" I don't get any match...
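
The same effect is easy to reproduce outside the Editor. In this small Python sketch the position of the soft hyphen inside the word is my own assumption (the screenshot doesn't show it); the point is only that the invisible character breaks a plain substring search until it is stripped.

```python
SOFT_HYPHEN = "\u00ad"

# "cubierta" with an invisible soft hyphen (position chosen arbitrarily).
word = "cu" + SOFT_HYPHEN + "bierta"

print(len(word))                                    # 9, not 8
print("cubierta" in word)                           # False
print("cubierta" in word.replace(SOFT_HYPHEN, ""))  # True
```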

If possible, I would like to have a toggle option much like the "paragraph icon" button in MS Word. This button/preference would cause all these kinds of "hidden" symbols to be explicitly rendered with arbitrary but related characters. (I say arbitrary because their "true" rendering is invisible.) Possible examples:
  • Soft hyphens replaced by · (as this is how they usually appear in printed dictionaries), but with a "red" background to distinguish them from a true ·. (Similar to how non-breaking spaces and other symbols currently use a yellow background.)
  • The combined "ñ" would be split into its components: a plain "n" and a ~ after it (also with a red background to distinguish it from a true "~" symbol).
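
As a rough idea of what such a toggle could do to the text, here is a Python sketch. Everything in it is hypothetical: the `reveal_hidden` function, the `VISIBLE` table, and the use of square brackets as a stand-in for the red background. Combining marks and format characters (Unicode category "Cf", which includes the soft hyphen) are replaced by a visible marker; anything without an entry in the table is shown as its code point.

```python
import unicodedata

# Visible stand-ins for invisible characters (illustrative choices only).
VISIBLE = {
    "\u00ad": "·",   # soft hyphen
    "\u0303": "~",   # combining tilde
}

def reveal_hidden(text: str) -> str:
    """Return text with combining marks and format characters made visible."""
    out = []
    for ch in text:
        is_hidden = (
            ch in VISIBLE
            or unicodedata.combining(ch) != 0
            or unicodedata.category(ch) == "Cf"
        )
        if is_hidden:
            # Brackets stand in for the red background.
            out.append("[" + VISIBLE.get(ch, f"U+{ord(ch):04X}") + "]")
        else:
            out.append(ch)
    return "".join(out)

print(reveal_hidden("cu\u00adbierta"))  # cu[·]bierta
print(reveal_hidden("an\u0303o"))       # an[~]o
print(reveal_hidden("a\u00f1o"))        # año (precomposed, left untouched)
```

Note that the precomposed "ñ" is left alone, so only the combined form gets flagged.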

What do you think about these issues/possible enhancements?
Attached Thumbnails
Name:	Combining tilde.jpg