It would be interesting to see the QChar values of a word containing the smart right single quote when it reaches the spellcheck code on Windows. This must be either a Qt-specific bug on Windows or an encoding issue at some point, as it works on both Linux and Mac.
I will eyeball the code to see if I can find a suspect. |
I am betting the problem is here:
Code:
QString Utility::getSpellingSafeText(const QString &raw_text)
U+2019 in UTF-8 is a 3-byte sequence, 0xE2 0x80 0x99, so the fromUtf8 routine should be passed that byte sequence. Alternatively, we could load a QChar with U+2019 and use toUtf8 to generate the input or, better yet, use the QChar directly. |
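To make the byte sequence mentioned above concrete, here is a minimal sketch in plain standard C++ (no Qt) that encodes a code point as UTF-8. The helper name `encodeUtf8` is hypothetical and not part of Sigil or Qt; it only illustrates why U+2019 becomes the three bytes 0xE2 0x80 0x99 and why a single byte cannot stand in for it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Sigil or Qt code): encode one Unicode code point
// as UTF-8. Surrogate code points are not rejected in this sketch.
std::string encodeUtf8(char32_t cp) {
    assert(cp <= 0x10FFFF);  // anything larger is outside the Unicode range
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (U+2019 falls in this range)
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encodeUtf8(0x2019)` yields a three-character `std::string`, which is the input a UTF-8 decoding routine would expect for the right single quote.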
Let me know if there's anything you need me to try compiling and/or testing on Windows.
|
So a better way to write this might be:
return text.replace(QChar(0x2019),QChar(0x27));
DiapDealer, when you get a free moment, would you try that change in Misc/Utility.cpp in getSpellingSafeText and see if it makes any difference? Thanks |
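For readers without a Qt build at hand, the same idea can be sketched with the standard library. This is only a stand-in for the Qt call proposed above: it uses `std::u16string` in place of `QString` (both hold UTF-16 code units), and the function name `spellingSafeText` is made up for the illustration.

```cpp
#include <algorithm>
#include <string>

// Stand-in for the proposed QString::replace(QChar, QChar) call: swap every
// right single quote (U+2019) for a straight apostrophe (U+0027).
// Working on code units avoids any round trip through a byte encoding.
std::u16string spellingSafeText(std::u16string text) {
    std::replace(text.begin(), text.end(), u'\u2019', u'\'');
    return text;
}
```

Because the comparison happens on 16-bit code units rather than on bytes, the result does not depend on how the source file or the platform encodes narrow string literals, which is the kind of issue suspected on Windows.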
Do you want me to push that change? It may not help, but certainly should not hurt.
|
Quote:
It also fixes the similar problem of adding words with smart-apostrophes to a user word-list (only adding a straight apos char would work previously). |
Glad to hear it! I will push it later this evening once I am back at my developer box.
|
Just pushed that fix to master.
|
Also, I have just pushed support for spellchecking words with numbers as controlled by a Sigil preference setting. That small change actually forced changes in many files and a ui dialog.
Please note, if your particular dictionary does not have any words with digits in them in their wordlist, this feature will not be of much help. This feature should appear in the next release unless I messed something up. |
Quote:
The only thing in the above mentioned situations that isn't covered (that I've noticed) is: Quote:
|
Words that have an internal normal dash (hyphen) should be spell checked properly given how the code handles them. If not, something is funny.
|
The individual letters A, B, etc. and the numbers after the hyphen are all valid standalone words, so they are legal when hyphenated. That said, Gbh-17 should show up as wrong since Gbh is not a valid word. This also depends on the wordchar list provided in the en_US.aff file (or whatever dictionary aff file you are using).
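The rule described here can be sketched roughly as follows. This is not Sigil's actual code: the function names, the `std::set` dictionary, and the choice to accept any all-digit part are assumptions made for the illustration, based only on the statement that each part of a hyphenated word must be valid on its own.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Assumption for this sketch: a part made only of digits counts as valid.
bool isAllDigits(const std::string &s) {
    return !s.empty() && std::all_of(s.begin(), s.end(), [](unsigned char c) {
        return std::isdigit(c) != 0;
    });
}

// Hypothetical check: a hyphenated word passes only if every part between
// hyphens is either all digits or present in the (toy) dictionary.
bool checkHyphenated(const std::string &word, const std::set<std::string> &dict) {
    std::istringstream parts(word);
    std::string part;
    while (std::getline(parts, part, '-')) {
        if (!isAllDigits(part) && dict.count(part) == 0) {
            return false;  // one bad part makes the whole word misspelled
        }
    }
    return true;
}
```

With a toy dictionary containing "A" and "B", `B-17` passes while `Gbh-17` fails, matching the behaviour described in the post.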
|
Quote:
Quote:
This wasn't necessarily about showing up as misspelled, it was about showing up in the list at all. For example:
Code:
The Letter B, B-17 Bomber, and Room B9.
In reality, there is only 1 "B" + 1 "B-17" + 1 "B9". This becomes a serious issue when it happens to something common, like "A", or the Index/Footnote Example, where there can be hundreds of "A" + "n" + "ff" + "f" within the EPUB. It becomes impossible to use the Spellcheck List to locate/find and correct these. Or take the case of "l92l": that shows up as 2 "l". Good luck searching through every lowercase 'l' in the book trying to find it! |
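The counting effect described in this post can be reproduced with a small tokenizer sketch. It is not Sigil's word-splitting code; `countWords` and its `extraWordChars` parameter are hypothetical, and the sketch simply assumes that any character that is neither a letter nor listed in `extraWordChars` ends the current word.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical tokenizer: a word is a maximal run of ASCII letters plus any
// character listed in extraWordChars. Returns how often each word occurs.
std::map<std::string, int> countWords(const std::string &text,
                                      const std::string &extraWordChars) {
    std::map<std::string, int> counts;
    std::string current;
    for (unsigned char c : text) {
        bool isWordChar =
            std::isalpha(c) != 0 ||
            extraWordChars.find(static_cast<char>(c)) != std::string::npos;
        if (isWordChar) {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            ++counts[current];  // a separator closes the current word
            current.clear();
        }
    }
    if (!current.empty()) {
        ++counts[current];  // flush the last word
    }
    return counts;
}
```

When digits and the hyphen act as separators (`extraWordChars` empty), the example sentence yields three entries for "B", and "l92l" yields two for "l". When they are treated as word characters (`"-0123456789"`), the same sentence yields one each of "B", "B-17", and "B9", which is what the poster expects to see in the list.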