Old 12-11-2025, 11:25 PM   #1
seanos
Selecting words with middle dot · (Qt problem?)

Sigil seems to be providing me with a lot of little annoyances this time around.


I’ve just noticed that word selection in CodeView has changed. I’m guessing this has something to do with Qt and so is effectively unfixable, but here goes.


Previously I could double-click a word containing what Unicode calls “Middle Dot” (U+00B7) · and the whole word would be selected, but after updating, only part of the word is selected. I have some keyboard macros that rely on that double-click selection.


The most common uses of this character (that I know of) are in Catalan (& Occitan) and in modern transcription of Old Irish. In both cases, the dot occurs inside words and does not mark a word boundary.


It’s always a bit hard to know/remember/find out what other Qt programs I have, but I see I have QOwnNotes installed (Qt 6.9.3) and it shows the expected whole word selection.


PageEdit, LibreOffice, Sublime Text and Calibre E-book Viewer select the whole word.


Firefox (displaying the file exported from Sigil) does not.
Old 12-12-2025, 12:40 AM   #2
seanos
Same behaviour with non-breaking hyphens.

In both cases Ctrl+Shift selection selects the whole word, but double-click does not.
Old 12-12-2025, 07:22 AM   #3
DiapDealer
https://www.mobileread.com/forums/sh...d.php?t=369815
Old 12-12-2025, 09:01 AM   #4
KevinH
Sigil Developer
Is the Unicode “Middle Dot” (U+00B7) matched by the word regular expression "\\w+" when UseUnicodeProperties is set? That is how CodeView finds its word boundaries, since the internal Qt functions fail to exclude all forms of quotes and do not follow Unicode standards.

If it is not considered a Unicode "word" character, it will be excluded, as we now use QRegularExpression (\\w+) with UseUnicodeProperties set to extract the true Unicode word out of the selected string of characters.
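
To illustrate the effect described above, here is a minimal Python sketch. It is not Sigil's code: Python's `re` module stands in for QRegularExpression (its `\w` is Unicode-aware for `str` patterns, which is assumed to be close enough for this character), and the helper `word_at` and the sample word are hypothetical.

```python
import re

# Stand-in for the "\\w+" word pattern; Unicode-aware for str in Python 3.
WORD = re.compile(r"\w+")

def word_at(text: str, pos: int) -> str:
    """Return the run of word characters covering index pos, or '' if none."""
    for m in WORD.finditer(text):
        if m.start() <= pos < m.end():
            return m.group()
    return ""

# Catalan word with a middle dot (U+00B7) between the two l's.
sample = "col\u00b7lecci\u00f3"

print(WORD.findall(sample))  # two separate matches, split at the dot
print(word_at(sample, 1))    # the part before the dot
print(word_at(sample, 5))    # the part after the dot
```

A "click" at index 3 (the dot itself) returns an empty string in this sketch, because the dot is not part of any `\w+` run.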

So in CodeView, type a word with that middle dot in it, then use Sigil's Find and Replace set for regex search (make sure the Unicode property flag is set) with that search expression, and use Find to determine whether that Unicode character is deemed to be a word character or not.
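
The same kind of check can be approximated outside Sigil. The sketch below uses Python's `re` as a stand-in, so it shows how Python classifies these characters and only suggests what a Unicode-aware `\w` in Qt is likely to do. The helper name `is_word_char` is hypothetical.

```python
import re

def is_word_char(ch: str) -> bool:
    """True if the single character ch matches a Unicode-aware \\w."""
    return re.fullmatch(r"\w", ch) is not None

# Letter, accented letter, underscore, middle dot, non-breaking hyphen.
for ch in ("a", "\u00e9", "_", "\u00b7", "\u2011"):
    print(f"U+{ord(ch):04X}", is_word_char(ch))
```

In Python the first three report True, while the middle dot (U+00B7) and the non-breaking hyphen (U+2011) report False.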

Update:

According to this site: https://codepoints.net/U+00B7?lang=en
It is considered "inter-word" punctuation and its group is "Other Punctuation". It is not considered by this Unicode definition to be a character *inside* a word (i.e. inter, not intra).

You may be using it in some other way, but according to the official Unicode properties it is not considered part of a word.
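
The general category can also be read from a local copy of the Unicode Character Database. A minimal sketch using Python's standard `unicodedata` module follows; the helper `describe` is hypothetical, and the non-breaking hyphen (U+2011) is included only because it came up earlier in the thread.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str]:
    """Return (official name, general category) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch)

for ch in ("\u00b7", "\u2011", "l"):
    name, category = describe(ch)
    print(f"U+{ord(ch):04X} {name} {category}")
```

Category "Po" is Other Punctuation and "Pd" is Dash Punctuation, whereas letters fall under the L* categories.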

Last edited by KevinH; 12-12-2025 at 10:53 AM.
All times are GMT -4.