#1
Zealot
Posts: 118
Karma: 12
Join Date: Apr 2010
Location: Melbourne, Australia
Device: Kobo Sage, Kobo Aura H2O, LG V20
|
Selecting words with middle dot · (Qt problem?)
Sigil seems to be providing me with a lot of little annoyances this time around.
I’ve just noticed that word selection in CodeView has changed. I’m guessing this has something to do with Qt and so is effectively unfixable, but here goes. Previously I could double-click a word containing what Unicode calls “Middle Dot” (U+00B7) · and the whole word would be selected, but after updating only part of the word is selected. I have some keyboard macros that rely on that double-click selection.
The most common uses of this character (that I know of) are in Catalan (and Occitan) and in modern transcription of Old Irish. In both cases, the dot occurs inside words and does not mark a word boundary.
It’s always a bit hard to know/remember/find out what other Qt programs I have, but I see I have QOwnNotes installed (Qt 6.9.3) and it shows the expected whole-word selection. PageEdit, LibreOffice, Sublime Text and Calibre E-book Viewer select the whole word. Firefox (displaying the file exported from Sigil) does not.
#2
Zealot
Posts: 118
Karma: 12
Join Date: Apr 2010
Location: Melbourne, Australia
Device: Kobo Sage, Kobo Aura H2O, LG V20
|
Same behaviour with non-breaking hyphens.
In both cases Ctrl+Shift selection selects the whole word, but double-click does not.
#3
Grand Sorcerer
Posts: 29,064
Karma: 211348980
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
#4
Sigil Developer
Posts: 9,282
Karma: 6686152
Join Date: Nov 2009
Device: many
|
Is that Unicode “Middle Dot” (U+00B7) considered to be a member of the regular expression used to match a word, "\\w+", when the UnicodeProperty is set? That is how CodeView finds its word boundaries, since the internal Qt functions fail to exclude all forms of quotes and do not follow Unicode standards.
If it is not considered a Unicode "word" character, it will be excluded, as we now use QRegularExpression (\\w+) with UseUnicodeProperties set to extract the true Unicode word out of the selected string of characters. So in CodeView, type a word with that middle dot in it, then use Sigil's Find and Replace set for regex search (make sure the Unicode property flag is set) with that search expression, and use Find to determine whether that Unicode char is deemed to be a word character or not.
Update: According to this site: https://codepoints.net/U+00B7?lang=en it is considered "inter-word" punctuation and its group is "Other Punctuation". It is not considered by this Unicode definition to be a character *inside* a word (i.e. inter, not intra). You may be using it in some other way, but according to the official Unicode properties it is not considered part of a word.
Last edited by KevinH; 12-12-2025 at 10:53 AM.
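For anyone who wants to try the same check outside Sigil, here is a rough sketch in Python. It is only an analogy: it uses Python's own `re` module (where `\w` is Unicode-aware for `str` patterns) and not Qt's QRegularExpression, so it illustrates the idea rather than Sigil's actual code. The helper `word_at` is a made-up name standing in for "the word under the double-click".

```python
import re
import unicodedata

# In Python 3, \w on str patterns is Unicode-aware by default.
# Assumption: this approximates, but is not identical to, the
# QRegularExpression + UseUnicodeProperties behaviour described above.
WORD_RE = re.compile(r"\w+")


def word_at(text: str, pos: int) -> str:
    """Hypothetical helper: return the word-character run covering index pos.

    Returns an empty string when pos sits on a non-word character.
    """
    for match in WORD_RE.finditer(text):
        if match.start() <= pos < match.end():
            return match.group()
    return ""


sample = "col\u00b7lecci\u00f3"  # Catalan "col·lecció", with U+00B7 inside the word

print(unicodedata.category("\u00b7"))  # general category of MIDDLE DOT
print(unicodedata.category("\u2011"))  # general category of NON-BREAKING HYPHEN
print(WORD_RE.findall(sample))         # the runs that \w+ extracts
print(word_at(sample, 1))              # "double-click" on the second letter
```

If the middle dot and the non-breaking hyphen both report a punctuation category and the word comes back in two pieces, that matches the partial selection described in the first two posts.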