MobileRead Forums - View Single Post - Non-English issues

KevinH · 09-14-2025, 05:22 PM

Quote:

Originally Posted by Moonbase59

It’s a shame that so many libraries and tools still hard-code just a few (too few!) values instead of relying on the well-defined Unicode properties. Let’s hope this gets better over time…

In general I agree. But unfortunately the Unicode spec for determining a word boundary is a huge two-dimensional table that still has lots of special cases. So a whole Unicode library is needed for something that, for most languages, is quite straightforward and quite fast (is it a space, a quote, or punctuation?).

See https://doc.qt.io/qt-6/qtextboundaryfinder.html
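
To make the trade-off concrete, here is a minimal Python sketch of the "is it a space, or quote, or punctuation" approach. The function name `naive_words` and the `ASCII_BREAKS` set are made up for illustration; this is not what Qt, ICU, or any other library actually does, just the kind of hard-coded shortcut being discussed.

```python
# Hypothetical illustration only: a hard-coded, ASCII-only word splitter.
# ASCII_BREAKS and naive_words are invented names, not part of any library.
ASCII_BREAKS = set(" \t\r\n.,;:!?\"'()[]{}<>-/")


def naive_words(text: str) -> list[str]:
    """Split text into words at any character listed in ASCII_BREAKS."""
    words = []
    current = []
    for ch in text:
        if ch in ASCII_BREAKS:
            # A break character ends the word collected so far (if any).
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    print(naive_words('He said, "hello world!"'))
    print(naive_words("don't stop"))
    print(naive_words("“Hallo”"))
    print(naive_words("日本語のテキスト"))
```

It is a single pass with no tables, and it handles plain English input. The short list is also exactly where it goes wrong: "don't" is cut in two at the apostrophe, the typographic quotes around “Hallo” stay glued to the word because they are not in the set, and text written without spaces comes back as one long "word".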

The world is eating up CPU cycles to support languages that are just too complex for their own good!

(As an aside, can someone please explain why in German the word for a woman's skirt is masculine? Babbel is driving me crazy with "der Rock".)

The Unicode spec, in my opinion, is a classic example of what happens when you get a worldwide committee to design a spec!

I wish the computer world would standardize on one Unicode support library (ICU?), make it available in all computer languages and string manipulation systems, and build it into every OS. Until then we are stuck with partial implementations all over the place.