Old Yesterday, 04:16 AM   #1
jugaor
Enthusiast
Posts: 35
Karma: 10
Join Date: Jun 2011
Location: Lima, Peru
Device: Kindle 10Gen / Kobo Aura HD / Nook STR
Non-English issues

Hello.
I use Sigil on Windows 11, both in Spanish.
For some time now, I have noticed two behaviors that I hope can be fixed:

1. Text boundaries are inconsistent: for example, when double-clicking or using Ctrl+cursor, in cases like:
"abc" (abc) -abc-
it correctly selects from a to c (not the symbols).

But it does not recognize symbols from other languages as boundaries:
—With opening exclamation and question marks (necessary in Spanish) it also selects the first two.
¡abc! ¿abc?
—With European/Latin quotation mark variants
«abc» “abc” ‘abc’
and other symbols, it selects the word + both of them.
…abc… —abc— •abc•

I am leaving a barebones sample epub (examples taken from the “Quotation marks” page on Wikipedia), in case it is useful.
(I have deliberately removed language tags from opf/xhtml.)


2. In the Preview window, clicking Inspect Page always displays this message:
Spoiler:

But none of the three choices persists: the next session displays the 'Zod, Ursa & Non' buttons again.
Could Sigil save the user's choice or, at least, allow the message to be disabled?

Attached Thumbnails: DevTools.jpg (56.9 KB)
Attached Files: quotation marks and text boundaries [comillas y delimitadores de texto].epub (2.2 KB)
Old Yesterday, 08:25 AM   #2
KevinH
Sigil Developer
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
The ctrl-click in CodeView is not controlled by Sigil's code. It is built into Qt, specifically the QPlainTextEdit widget. What it includes or excludes should presumably be controlled by your locale and by what Unicode defines as punctuation. It is not under our direct control.

I will look to see if there is any workaround we could try.

And the Chrome inspector code is not ours to control either; it is built into Qt's QtWebEngine. We already allow QtWebEngine to save to local storage, as specified in our QWebEngineProfile. As long as your Sigil Preferences folder is located where you have full write permission, all of that should work, so I have no idea why it is not saving things there. Perhaps downloading and installing the latest Chrome browser, loading a page, and firing up its developer-mode inspector may help.

So both of these are really Qt bugs or changes. Perhaps you should file an official bug report with Qt so that these issues are addressed upstream?
Old Yesterday, 12:21 PM   #3
DiapDealer
Grand Sorcerer
Posts: 28,737
Karma: 206739468
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I can confirm that the Inspector language setting will not persist between Sigil sessions on Windows. The light/dark interface setting does seem to persist, though. But I don't know how much of that would be because of locales.

Last edited by DiapDealer; Yesterday at 12:26 PM.
Old Yesterday, 12:52 PM   #4
KevinH
Sigil Developer
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
Okay, I finally traced what happens when you double-click inside a word in CodeView: that routine calls atWordSeparator in qtextengine.cpp, which is simply the following:

Code:
bool QTextEngine::atWordSeparator(int position) const
{
    const QChar c = layoutData->string.at(position);
    switch (c.unicode()) {
    case '.':
    case ',':
    case '?':
    case '!':
    case '@':
    case '#':
    case '$':
    case ':':
    case ';':
    case '-':
    case '<':
    case '>':
    case '[':
    case ']':
    case '(':
    case ')':
    case '{':
    case '}':
    case '=':
    case '/':
    case '+':
    case '%':
    case '&':
    case '^':
    case '*':
    case '\'':
    case '"':
    case '`':
    case '~':
    case '|':
    case '\\':
        return true;
    default:
        break;
    }
    return false;
}
So no locale info is used, no Unicode character classes are used, nothing for international support. This is simply horrible by Qt. They should be ashamed of that piece of code.

So you really should file a bug with Qt and let them know that they need to fix their QTextEngine class's definition of atWordSeparator to use Unicode character classes.
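To make the suggestion concrete, here is a minimal sketch of a separator test that goes beyond ASCII. It is not Qt code and not a proposed patch: `isAsciiSeparator`, `isExtraPunctuation` and `isWordSeparator` are hypothetical names, and the extra code points are a hand-picked subset (the characters from the first post plus the U+2010 to U+2027 run of dashes, quotes, bullets and ellipsis), not the full Unicode punctuation categories. Inside Qt itself a check based on `QChar::isPunct()` would likely be the more natural route.

```cpp
#include <string>

// ASCII separators, same list as the atWordSeparator listing above.
// Hypothetical helper, not part of Qt.
inline bool isAsciiSeparator(char32_t c) {
    static const std::u32string kAscii = U".,?!@#$:;-<>[](){}=/+%&^*'\"`~|\\";
    return kAscii.find(c) != std::u32string::npos;
}

// Assumption: a small hand-picked subset of non-ASCII punctuation.
// A real fix would consult the Unicode general category instead.
inline bool isExtraPunctuation(char32_t c) {
    return c == 0x00A1                      // inverted exclamation mark
        || c == 0x00BF                      // inverted question mark
        || c == 0x00AB                      // left guillemet
        || c == 0x00BB                      // right guillemet
        || (c >= 0x2010 && c <= 0x2027);    // dashes, curly quotes, bullet, ellipsis
}

// Combined test: ASCII list first, then the extra punctuation.
inline bool isWordSeparator(char32_t c) {
    return isAsciiSeparator(c) || isExtraPunctuation(c);
}
```

With this predicate, `isWordSeparator(0x00BF)` and `isWordSeparator(0x2026)` are true while letters such as U+00F1 (n with tilde) stay part of the word. Note that U+2019 doubles as an apostrophe, so it would split contractions, exactly as the ASCII apostrophe already does in Qt's list.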

At least earlier in that routine they use QCharAttributes to determine whether a character is whitespace.

They have the Unicode tools to do that: see qunicodetools.cpp

Code:
struct QCharAttributes
{
    uchar graphemeBoundary : 1;
    uchar wordBreak        : 1;
    uchar sentenceBoundary : 1;
    uchar lineBreak        : 1;
    uchar whiteSpace       : 1;
    uchar wordStart        : 1;
    uchar wordEnd          : 1;
    uchar mandatoryBreak   : 1;
};
And the code that triggers all of this is QTextCursor's select() function, which relies on these two snippets:

Code:
    case QTextCursor::EndOfWord: {
        QTextEngine *engine = layout->engine();
        const QCharAttributes *attributes = engine->attributes();
        const int len = blockIt.length() - 1;
        if (relativePos >= len)
            return false;
        if (engine->atWordSeparator(relativePos)) {
            ++relativePos;
            while (relativePos < len && engine->atWordSeparator(relativePos))
                ++relativePos;
        } else {
            while (relativePos < len && !attributes[relativePos].whiteSpace && !engine->atWordSeparator(relativePos))
                ++relativePos;
        }
        newPosition = blockIt.position() + relativePos;
        break;
    }

...

    case QTextCursor::StartOfWord: {
        if (relativePos == 0)
            break;

        // skip if already at word start
        QTextEngine *engine = layout->engine();
        const QCharAttributes *attributes = engine->attributes();
        if ((relativePos == blockIt.length() - 1)
            && (attributes[relativePos - 1].whiteSpace || engine->atWordSeparator(relativePos - 1)))
            return false;

        if (relativePos < blockIt.length()-1)
            ++position;

        Q_FALLTHROUGH();
    }
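The effect of those scans can be reproduced outside Qt with a simplified model. The sketch below is an assumption-laden stand-in, not the Qt implementation: `selectWord` is a hypothetical function that scans left and right from a position until it meets whitespace or one of the ASCII separators, and it assumes the start-of-word scan mirrors the end-of-word loop shown above. It illustrates why `(abc)` selects only the letters while a word wrapped in guillemets is selected together with them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Same ASCII-only list as atWordSeparator; hypothetical helper, not Qt code.
inline bool isAsciiSeparator(char32_t c) {
    static const std::u32string kAscii = U".,?!@#$:;-<>[](){}=/+%&^*'\"`~|\\";
    return kAscii.find(c) != std::u32string::npos;
}

// Minimal whitespace test standing in for QCharAttributes::whiteSpace.
inline bool isSpace(char32_t c) {
    return c == U' ' || c == U'\t' || c == U'\n';
}

// Returns the half-open range [begin, end) of the "word" around pos.
// Simplified model: both directions stop at whitespace or an ASCII separator.
std::pair<std::size_t, std::size_t> selectWord(const std::u32string& text,
                                               std::size_t pos) {
    assert(pos < text.size());
    auto stops = [](char32_t c) { return isSpace(c) || isAsciiSeparator(c); };
    std::size_t begin = pos;
    while (begin > 0 && !stops(text[begin - 1]))
        --begin;
    std::size_t end = pos;
    while (end < text.size() && !stops(text[end]))
        ++end;
    return {begin, end};
}
```

For a line containing `(abc)` followed by a space and `abc` wrapped in U+00AB and U+00BB, selecting inside the first word yields just the three letters, whereas selecting inside the second yields five characters, guillemets included, because neither guillemet is in the ASCII list.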

Last edited by KevinH; Yesterday at 01:11 PM.
Old Yesterday, 01:14 PM   #5
KevinH
Sigil Developer
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
Given the above, there is not a lot we can do to work around this issue. Perhaps we could strip a list of additional characters, built from QCharAttributes, off each end of the selection, but that would really need a full Unicode implementation of some sort.

Perhaps we could add an env var that users can set to list the characters they do not want included when auto-selecting a word, and leave it up to the user to set it properly.

This is something to consider for a future release.
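As a rough illustration of the stripping idea, here is a minimal sketch in plain C++. It is not Sigil code: `trimSelection` is a hypothetical helper, and the list of unwanted characters is simply passed in as a parameter, so whether it would come from an env var, a preference, or something else is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Shrinks the half-open selection [begin, end) until it neither starts nor
// ends with a character listed in `unwanted`. Hypothetical helper.
std::pair<std::size_t, std::size_t> trimSelection(const std::u32string& text,
                                                  std::size_t begin,
                                                  std::size_t end,
                                                  const std::u32string& unwanted) {
    assert(begin <= end && end <= text.size());
    auto isUnwanted = [&unwanted](char32_t c) {
        return unwanted.find(c) != std::u32string::npos;
    };
    // Drop unwanted characters from the front of the selection.
    while (begin < end && isUnwanted(text[begin]))
        ++begin;
    // Drop unwanted characters from the back of the selection.
    while (end > begin && isUnwanted(text[end - 1]))
        --end;
    return {begin, end};
}
```

Applied to a five-character selection of `abc` wrapped in guillemets, with U+00AB and U+00BB in the unwanted list, the range shrinks to the three letters. Characters in the middle of the selection are never touched, so a user-supplied list cannot split a word.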
Old Yesterday, 01:16 PM   #6
KevinH
Sigil Developer
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
Quote:
Originally Posted by DiapDealer View Post
I can confirm that the Inspector language setting will not persist between Sigil sessions on Windows. The light/dark interface seems to though. But I don't know how much of that would be because of locales.
Did it create anything in your Sigil Preferences folder in a "local-devtools" folder?

That is where we told the Inspector's QWebEngineView to put its local storage.
Old Yesterday, 02:33 PM   #7
KevinH
Sigil Developer
Posts: 8,937
Karma: 6361444
Join Date: Nov 2009
Device: many
Okay, I tried rewriting the WebProfileMgr to explicitly create a QWebEngineProfile for our Inspector, but the local-devtools folder never appears to get used at all. If I change settings (the gear) in Inspector nothing is ever written to local-devtools or any place else I could find.

So again this is a bug in QtWebEngine. As far as I can tell, we cannot work around it without moving to Qt 6.9.2 and using its QWebEngineProfileBuilder class to properly set the cache path and return to the disk storage mode.

But that is something for a future release, as all of that requires that we first move to Qt 6.9.2 and then heavily conditionalize the code so it still works back to Qt 6.4.
Old Yesterday, 06:04 PM   #8
Moonbase59
Addict
Posts: 234
Karma: 1000244
Join Date: Oct 2021
Location: Germany
Device: Tolino Vision 5, Tolino Tab 8", Pocketbook Era (16GB)
It’s a shame that so many libraries and tools still hard-code just a few (too few!) values instead of relying on the well-defined Unicode properties. Let’s hope this gets better over time…
All times are GMT -4. The time now is 05:33 PM.

