View Single Post
Old Yesterday, 12:52 PM   #4
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
Okay, I finally tracked the path back when a double-click happens when in a word in CodeView and that routine calls atWordSeparator in qtextengine.cpp which is simply the following:

Code:
bool QTextEngine::atWordSeparator(int position) const
{
    const QChar c = layoutData->string.at(position);
    switch (c.unicode()) {
    case '.':
    case ',':
    case '?':
    case '!':
    case '@':
    case '#':
    case '$':
    case ':':
    case ';':
    case '-':
    case '<':
    case '>':
    case '[':
    case ']':
    case '(':
    case ')':
    case '{':
    case '}':
    case '=':
    case '/':
    case '+':
    case '%':
    case '&':
    case '^':
    case '*':
    case '\'':
    case '"':
    case '`':
    case '~':
    case '|':
    case '\\':
        return true;
    default:
        break;
    }
    return false;
}
So no locale info is used, no unicode character classes are used, nothing for international support. This is simply horrible by Qt. They should be ashamed of that piece of code.

So you really should file a bug in Qt and let them know that they need to really fix their QTextEngine class defintion of atWordBoundary to use unicode character classes.

At least earlier in that routine they use QCharAttributes to determine if whitespace or not.

They have the unicode tools to do that: see qunicodetools.cpp

Code:
struct QCharAttributes
{
    uchar graphemeBoundary : 1;
    uchar wordBreak        : 1;
    uchar sentenceBoundary : 1;
    uchar lineBreak        : 1;
    uchar whiteSpace       : 1;
    uchar wordStart        : 1;
    uchar wordEnd          : 1;
    uchar mandatoryBreak   : 1;
};
And the code that triggers all of this is in QTextCursor select() function that uses these two routine snippets:

Code:
    case QTextCursor::EndOfWord: {
        QTextEngine *engine = layout->engine();
        const QCharAttributes *attributes = engine->attributes();
        const int len = blockIt.length() - 1;
        if (relativePos >= len)
            return false;
        if (engine->atWordSeparator(relativePos)) {
            ++relativePos;
            while (relativePos < len && engine->atWordSeparator(relativePos))
                ++relativePos;
        } else {
            while (relativePos < len && !attributes[relativePos].whiteSpace && !engine->atWordSeparator(relativePos))
                ++relativePos;
        }
        newPosition = blockIt.position() + relativePos;
        break;
    }

...

case QTextCursor::StartOfWord: {
        if (relativePos == 0)
            break;

        // skip if already at word start
        QTextEngine *engine = layout->engine();
        const QCharAttributes *attributes = engine->attributes();
        if ((relativePos == blockIt.length() - 1)
            && (attributes[relativePos - 1].whiteSpace || engine->atWordSeparator(relativePos - 1)))
            return false;

        if (relativePos < blockIt.length()-1)
            ++position;

        Q_FALLTHROUGH();
    }

Last edited by KevinH; Yesterday at 01:11 PM.
KevinH is online now   Reply With Quote