03-02-2021, 08:22 PM | #1 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
Spellcheck dialog shows well-spelled words as misspelled
In Code View, the underlining works fine: only misspelled words are underlined. However, the Spellcheck dialog also lists correctly spelled words as misspelled.
I do not know whether I am configuring or using something wrong, or whether it is a problem with Sigil or my current epub. I had understood that adding xml:lang="X" was enough for it to work properly. In the screenshot, "informadas" is spelled correctly. This is a test with only 2 words, but I'm working on a large book with a lot of false positives, so the Spellcheck dialog is almost unusable for me.
I'll stay tuned to provide more information if necessary.
Regards.
-----------------------------------
Sigil 1.4.3
Qt 5.15.2
Gentoo Linux
Last edited by ebray187; 03-02-2021 at 08:27 PM. |
03-02-2021, 08:34 PM | #2 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
Also, I have set <dc:language>es</dc:language> in the content.opf file.
|
03-02-2021, 08:38 PM | #3 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
In Sigil's Preferences, what have you set for the default language, and what for the Primary and Secondary Dictionaries?
Do you have a proper Hunspell Spanish dictionary installed? |
03-02-2021, 08:52 PM | #4 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
User Interface Language: English
Default Language For Metadata: Spanish
For the primary language dictionary I have a custom Spanish dictionary (aff and dic files inside the hunspell_dictionaries folder). I have used this dictionary without problems in 0.6, 0.9.4, 1.2.1 and 1.3.0.
Nothing for the secondary language. |
03-02-2021, 09:01 PM | #5 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
Here is a zip with the epub and the dictionary
Last edited by ebray187; 03-28-2021 at 12:11 AM. |
03-02-2021, 09:06 PM | #6 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Does your custom dictionary follow the hunspell language naming convention:
es_ES.dic
es_ES.aff
Exactly how are the .aff and .dic files you are using named?
The spellcheck dialog does not use the Primary Dictionary per se. It instead depends on the normal naming convention to map language codes into dictionary names automatically.
My guess is your custom dictionary is not being mapped to es because its name differs from the naming convention.
Last edited by KevinH; 03-03-2021 at 10:51 AM. |
03-02-2021, 09:12 PM | #7 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Yep, how on earth is the SpellCheck Dialog, which can now handle any number of languages and dictionaries, supposed to know your .dic and .aff are for Spanish, given there is no language code at the start of the file names?
Rename them to match the expected convention (or, if you do not want to rename them, create two symlinks with names that match the convention), then restart Sigil and all should work. |
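The symlink route suggested above could be sketched roughly like this in Python. This is only an illustration under assumptions: the stem `my_spanish` is a hypothetical stand-in for the custom file names (the thread does not give them), `link_with_convention` is a made-up helper, and the demo runs in a temporary folder rather than in the real hunspell_dictionaries folder.

```python
import tempfile
from pathlib import Path
from typing import List


def link_with_convention(dict_dir: Path, stem: str, lang_code: str) -> List[Path]:
    """Create <lang_code>.aff/.dic symlinks pointing at <stem>.aff/.dic.

    `stem` and `lang_code` are caller-supplied; nothing here is Sigil API.
    """
    created = []
    for ext in (".aff", ".dic"):
        source = dict_dir / f"{stem}{ext}"
        if not source.is_file():
            raise FileNotFoundError(source)
        link = dict_dir / f"{lang_code}{ext}"
        # Skip names that already exist (including dangling symlinks).
        if not (link.exists() or link.is_symlink()):
            link.symlink_to(source.name)  # relative link inside the same folder
        created.append(link)
    return created


if __name__ == "__main__":
    # Demo in a throwaway folder with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "my_spanish.aff").write_text("SET UTF-8\n")
        (folder / "my_spanish.dic").write_text("1\ninformadas\n")
        for link in link_with_convention(folder, "my_spanish", "es_ES"):
            print(link.name)
```

To use it for real, point `dict_dir` at your own hunspell_dictionaries folder and restart Sigil afterwards, as described in the post.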
03-02-2021, 09:25 PM | #8 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
Got it. Thanks!!
From a user point of view I think it would be useful to be able to specify which dictionary to use for X language by following the convention of the xml:lang (for "en", use this; for "es" this). In the current state it can be a bit misleading that the primary language dictionary setting is not what applies to the spellcheck language. Anyway thanks a lot for your help and for all the work making this amazing tool. |
03-02-2021, 10:27 PM | #9 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
On Windows and macOS, we supply the Hunspell dictionaries along with Sigil, so of course they follow the convention of starting the file name with a language code.
Even MySpell, the predecessor to Hunspell, and ispell, the predecessor to MySpell, follow that naming convention of starting with a language code. So there is a long, long history on unix/linux of naming dictionaries so that no one has to guess what language they are for, especially as these dictionaries are shared across many apps.
That is why there really is no need to associate language codes with dictionaries. It would just make them harder to use and share, with repeated mappings needed for multiple apps.
Sorry, but Sigil will not be changing how we handle this.
Glad to hear you got it working! |
03-03-2021, 04:27 AM | #10 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
An EPUB's language goes like a pyramid:
1. Setting the content.opf correctly is the most important! And you did that great. This says "this book is in Spanish!"
2. (Optional) Add lang + xml:lang to your <html>:
Chapter01.xhtml (Before):
Code:
<html xmlns="http://www.w3.org/1999/xhtml">
Chapter01.xhtml (After):
Code:
<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es">
3. (Super duper optional) Mark "foreign words" with their language:
Before:
Code:
<p>¿Puedo ir al bathroom, por favor?</p>
After:
Code:
<p>¿Puedo ir al <span lang="en" xml:lang="en">bathroom</span>, por favor?</p>
The basic idea is lang = HTML + xml:lang = XML. I explained a little more in this post a few days ago: "Search and Replace" (Post #11)
One common error is having the book and its chapters mismatched. (So a Spanish book "es", but you accidentally set English "en" in an HTML chapter. This is pretty easy to spot in the Language column in Tools > Spellcheck > Spellcheck.)
Last edited by Tex2002ans; 03-03-2021 at 04:32 AM. |
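For anyone who wants to check the book/chapter mismatch outside of Sigil, here is a minimal Python sketch using only the standard library. The function names and the sample markup are assumptions made up for illustration; it simply lists every lang / xml:lang value in one XHTML file and flags the ones whose primary language differs from the book's dc:language. Deliberately marked foreign words are flagged too, so the result is a list to review, not a list of errors.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def declared_languages(xhtml_text):
    """Return the sorted language codes declared via lang or xml:lang."""
    root = ET.fromstring(xhtml_text)
    found = set()
    for element in root.iter():
        for attr in ("lang", XML_LANG):
            value = element.get(attr)
            if value:
                found.add(value)
    return sorted(found)


def mismatched_languages(xhtml_text, book_language):
    """Codes whose primary subtag differs from the book's dc:language."""
    book_primary = book_language.split("-")[0].lower()
    return [
        code
        for code in declared_languages(xhtml_text)
        if code.split("-")[0].lower() != book_primary
    ]


# Hypothetical chapter, modelled on the "After" examples above.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es">'
    "<body><p>¿Puedo ir al "
    '<span lang="en" xml:lang="en">bathroom</span>, por favor?</p></body></html>'
)

if __name__ == "__main__":
    print(declared_languages(SAMPLE))
    print(mismatched_languages(SAMPLE, "es"))
```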
||
03-03-2021, 09:46 AM | #11 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
Thank you both for taking the time to respond.
I fully understand what you are saying and it certainly makes a lot of sense. The only thing I can add is that I did not find anything in the documentation (changelog or manual) saying that, as of version 1.4, dictionaries must follow the Hunspell naming convention to work correctly with Sigil (despite how well it's explained in this thread as a fundamental practice). I understand that in this same forum you are working on an update of the manual, so I take the opportunity to mention that it would be very helpful to include the information that you and others have kindly shared in this and other posts about it.
The question that remains for me (and that I think was lost in the translation of my last message) is how a user can define which dictionary to use for each language. Perhaps it is not so common in English, but, for example, there are many spelling differences between Argentine Spanish and the Spanish of Spain. Before 1.4 I simply changed the dictionary in Preferences; today I don't know how to do it from Sigil without altering the epub or constantly changing the file names of my dictionaries.
Greetings and again, thank you very much for the time and patience.
Last edited by ebray187; 03-03-2021 at 09:48 AM. |
03-03-2021, 09:54 AM | #12 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
I mean taking into account that the Primary Language Dictionary does not affect the spellchecker but only the underlining of the code view.
Last edited by ebray187; 03-03-2021 at 09:59 AM. |
03-03-2021, 10:01 AM | #13 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Regional language differences just mean you use a language code that includes a region. There are many, many variations of English, but each has a different region code (as do the dictionaries).
en_US vs en_GB vs en_CA are different dictionaries targeted at the US, Great Britain, and Canada respectively. The dictionary names all start that way.
The dc:language code allows the region to be used: en-US instead of just en.
The xml:lang and lang attributes also allow a region to be specified: "en-GB" instead of just "en".
The language pulldown supports a large set of regional language codes. Here is a code snippet for just Spanish:
Code:
"es" << tr("Spanish") <<
"es-AR" << tr("Spanish") + QString(" - ") + tr("Argentina") <<
"es-BO" << tr("Spanish") + QString(" - ") + tr("Bolivia") <<
"es-CL" << tr("Spanish") + QString(" - ") + tr("Chile") <<
"es-CO" << tr("Spanish") + QString(" - ") + tr("Columbia") <<
"es-CR" << tr("Spanish") + QString(" - ") + tr("Costa Rica") <<
"es-DO" << tr("Spanish") + QString(" - ") + tr("Dominican Republic") <<
"es-EC" << tr("Spanish") + QString(" - ") + tr("Ecuador") <<
"es-SV" << tr("Spanish") + QString(" - ") + tr("El Salvador") <<
"es-GT" << tr("Spanish") + QString(" - ") + tr("Guatemala") <<
"es-HN" << tr("Spanish") + QString(" - ") + tr("Honduras") <<
"es-MX" << tr("Spanish") + QString(" - ") + tr("Mexico") <<
"es-NI" << tr("Spanish") + QString(" - ") + tr("Nicaragua") <<
"es-PA" << tr("Spanish") + QString(" - ") + tr("Panama") <<
"es-PY" << tr("Spanish") + QString(" - ") + tr("Paraguay") <<
"es-PE" << tr("Spanish") + QString(" - ") + tr("Peru") <<
"es-PR" << tr("Spanish") + QString(" - ") + tr("Puerto Rico") <<
"es-ES" << tr("Spanish") + QString(" - ") + tr("Spain") <<
"es-UY" << tr("Spanish") + QString(" - ") + tr("Uruguay") <<
"es-VE" << tr("Spanish") + QString(" - ") + tr("Venezuela") <<
That said, if you would like to help edit the user-guide with additional information, we would be happy to consider it for inclusion.
Hope this helps! |
03-03-2021, 10:11 AM | #14 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
The choice of Primary Language dictionary does impact the SpellCheck dialog and real-time spell checking (red squiggly) in general.
If you choose en_GB as your Primary dictionary, then everywhere you use just "en" as a language code or xml:lang attribute will map to that dictionary over the en_US one. But Sigil has to be able to determine the language of a dictionary from the long-standing dictionary naming convention.
In Code View the red squiggly is determined completely by the Primary and Secondary dictionaries chosen, and it ignores any lang or xml:lang attributes, as using lang and xml:lang is not an epub2 requirement if dc:language is set. This is useful when only one language is used, as specified in dc:language and nowhere else, or when only a few words are taken from a second language.
To support true multilanguage spell checking, the use of xml:lang or lang attributes is required, and this is what is recommended for epub3 / html5. |
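The mapping described in this post can be pictured with a small Python sketch. To be clear, this is not Sigil's actual code, only a guess at the behaviour as explained here: an exact language-region match wins, a bare language code falls back to the Primary dictionary when it is for that language, and a dictionary whose name does not start with a language code is never matched. The function name `pick_dictionary` and the `my_spanish` name are hypothetical.

```python
def pick_dictionary(language_tag, available, primary=None):
    """Pick a dictionary name for a lang / xml:lang value, or None.

    `available` holds dictionary base names such as "en_GB" (no extension).
    """
    wanted = language_tag.replace("-", "_").lower()
    language = wanted.split("_")[0]

    # 1. An exact match such as "es-AR" -> "es_AR" wins.
    for name in available:
        if name.lower() == wanted:
            return name

    # 2. Otherwise consider every dictionary whose name starts with the language code.
    candidates = sorted(
        name for name in available if name.split("_")[0].lower() == language
    )
    if primary in candidates:
        return primary  # e.g. "en" maps to en_GB when en_GB is the Primary
    return candidates[0] if candidates else None


if __name__ == "__main__":
    print(pick_dictionary("en", ["en_US", "en_GB"], primary="en_GB"))
    print(pick_dictionary("es-AR", ["es_ES", "es_AR"], primary="es_ES"))
    # A name without a leading language code cannot be mapped to "es".
    print(pick_dictionary("es", ["my_spanish"], primary="my_spanish"))
```

The last call mirrors the problem from the first post: with no language code at the start of the file name, nothing can be found for "es", so every word ends up reported as misspelled.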
03-03-2021, 10:20 AM | #15 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
|
You explained it as an open book. Thanks!
|