Quote:
Originally Posted by Gary_M_Mugford
Dalton,
Due to my predilection for Scandinavian mysteries, I find myself with ONE teensy request more, and that's for comparisons that ignore the extended character set diacriticals. Not sure whether to consolidate WITH the extended character attributes or whether to just force everything back down to regular ASCII.
Thanks, GM
@GM:
See the attached example from Version 1.0.20.
One word of caution: The further a particular metadata language's alphabet drifts from the Roman alphabet, the less accurate the new 'Compare as: Decomposed & Normalized Alphabet' Transform Function will become. Western European languages should (of course) work accurately, but Chinese, Japanese, Korean, Thai, and so forth will be (at best) much less accurate. The only Eastern European language I tested was Polish, and ĶźŽ was viewed as equal to KzZ for the purposes of searching, so it is likely that most of the Slavic languages will work well.
Obviously, if all of the metadata is properly spelled in a particular language, then the search will work perfectly. The issues arise when it is not.
For example, assume that the original title in Polish contained "ĶźŽŦ", but the translated title contained "KzZF". That would fail a check for equality, because the letter "Ŧ" does not transliterate to an "F". MCS would say they are different for that reason alone.
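For anyone curious how a "decompose and normalize" comparison behaves in general, here is a rough Python sketch. It is an illustration only and not the MCS source: the function names fold_diacritics and same_ignoring_diacritics are made up for this example, and it assumes the comparison amounts to Unicode NFD decomposition followed by dropping the combining marks.

```python
import unicodedata


def fold_diacritics(text):
    """Decompose to NFD, then drop the combining marks (accents, carons, etc.).

    Hypothetical helper for illustration; not taken from MCS.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ignoring_diacritics(a, b):
    """Compare two strings after folding their diacritics away."""
    return fold_diacritics(a) == fold_diacritics(b)


# Each accented letter splits into a base letter plus a combining mark,
# so the marks can be discarded and the base letters compared.
print(same_ignoring_diacritics("ĶźŽ", "KzZ"))     # True

# "Ŧ" has no decomposition, so it survives folding unchanged and
# never becomes "F" (or any other plain letter).
print(same_ignoring_diacritics("ĶźŽŦ", "KzZF"))   # False
```

Under those assumptions the sketch is case-sensitive and only handles letters that Unicode can split into a base letter plus marks, which is one way to see why accuracy drops as an alphabet moves away from the Roman one.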
DaltonST