Old 03-02-2021, 08:22 PM   #1
ebray187
Member
ebray187 began at the beginning.
 
Posts: 15
Karma: 10
Join Date: Dec 2020
Device: epub
Spellcheck dialog shows well-spelled words as misspelled

In Code View, the underlining works fine: only misspelled words are underlined. However, the Spellcheck dialog also lists correctly spelled words as misspelled.
I do not know if I am configuring or using something wrong, or if it is a problem with Sigil or my current epub. I had understood that adding the xml:lang="X" was enough for it to work properly.



In the screenshot, "informadas" is spelled correctly. This is a test with only 2 words, but I'm working on a large book with a lot of false positives, so the Spellcheck dialog is almost unusable for me.

I'll stay tuned to provide more information if necessary.

Regards.

-----------------------------------
Sigil 1.4.3
QT 5.15.2
Gentoo Linux

Last edited by ebray187; 03-02-2021 at 08:27 PM.
Old 03-02-2021, 08:34 PM   #2
ebray187
Also, I have set <dc:language>es</dc:language> in the content.opf file.
Old 03-02-2021, 08:38 PM   #3
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
In Sigil's Preferences, what have you set for the default language, and what for the Primary and Secondary Dictionaries?

Do you have a proper Hunspell Spanish dictionary installed?
Old 03-02-2021, 08:52 PM   #4
ebray187
User Interface Language: English
Default Language For Metadata: Spanish

For the Primary Dictionary I have a custom Spanish dictionary (.aff and .dic files inside the hunspell_dictionaries folder). I have used this dictionary without problems in 0.6, 0.9.4, 1.2.1 and 1.3.0.

Nothing is set for the Secondary Dictionary.
Old 03-02-2021, 09:01 PM   #5
ebray187
Here is a zip with the epub and the dictionary

Last edited by ebray187; 03-28-2021 at 12:11 AM.
Old 03-02-2021, 09:06 PM   #6
KevinH
Does your custom dictionary follow the hunspell language naming convention:

es_ES.dic
es_ES.aff

Exactly how are the .aff and .dic files you are using named? The Spellcheck dialog does not use the Primary Dictionary per se. Instead, it depends on the normal naming convention to map language codes to dictionary names automatically.

My guess is that your custom dictionary is not being mapped to es because its name differs from the naming convention.
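
To make the naming convention concrete, here is a minimal Python sketch of how a language could be read off a dictionary file name. It is not Sigil's actual code: the function name `language_of_dictionary`, the simplified pattern, and the file name `diccionario_personal.dic` are all assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Simplified pattern (an assumption, not Sigil's rule): a 2-3 letter language
# code, optionally followed by "_" or "-" and a 2-letter region,
# e.g. "es", "es_ES", "en-GB".
_STEM = re.compile(r"([a-z]{2,3})(?:[_-]([A-Z]{2}))?")


def language_of_dictionary(filename: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (language, region) parsed from a dictionary file name, or None."""
    match = _STEM.fullmatch(Path(filename).stem)
    if match is None:
        # The name carries no recognizable language code.
        return None
    return match.group(1), match.group(2)


print(language_of_dictionary("es_ES.dic"))
print(language_of_dictionary("diccionario_personal.dic"))  # hypothetical custom name
```

A name such as `es_ES.dic` yields a language and a region, while a free-form name yields nothing to map a language code to, which is the situation described in this post.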

Last edited by KevinH; 03-03-2021 at 10:51 AM.
Old 03-02-2021, 09:12 PM   #7
KevinH
Yep, how on earth is the Spellcheck dialog, which can now handle any number of languages and dictionaries, supposed to know your .dic and .aff are for Spanish, given no language code at the start of the file names?

Rename them to match the expected convention, or, if you do not want to rename them, create two symlinks with names that match the convention. Then restart Sigil and all should work.
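
For anyone who prefers the symlink route, the following Python sketch creates the two links. The helper `link_dictionary` and the stem names passed to it are hypothetical; the directory is whatever folder holds your dictionaries. Creating symlinks may need extra privileges on Windows.

```python
from pathlib import Path
from typing import List


def link_dictionary(dict_dir: Path, custom_stem: str, convention_stem: str) -> List[Path]:
    """Create <convention_stem>.dic/.aff symlinks pointing at the custom files."""
    created = []
    for ext in (".dic", ".aff"):
        target = dict_dir / f"{custom_stem}{ext}"
        link = dict_dir / f"{convention_stem}{ext}"
        if not target.exists():
            raise FileNotFoundError(target)
        # Leave any existing file or link with the conventional name alone.
        if link.is_symlink() or link.exists():
            continue
        # Relative link, so the folder can be moved as a whole.
        link.symlink_to(target.name)
        created.append(link)
    return created
```

A call such as `link_dictionary(Path("hunspell_dictionaries"), "diccionario_personal", "es_ES")` would leave the original files untouched and add `es_ES.dic` and `es_ES.aff` next to them.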
Old 03-02-2021, 09:25 PM   #8
ebray187
Got it. Thanks!!

From a user point of view I think it would be useful to be able to specify which dictionary to use for X language by following the convention of the xml:lang (for "en", use this; for "es" this). In the current state it can be a bit misleading that the primary language dictionary setting is not what applies to the spellcheck language.

Anyway thanks a lot for your help and for all the work making this amazing tool.
Old 03-02-2021, 10:27 PM   #9
KevinH
On Windows and macOS, we supply the Hunspell dictionaries along with Sigil, so they of course follow the naming convention of starting with a language code.

Even MySpell, the predecessor to Hunspell, and ispell, the predecessor to MySpell, follow that naming convention of starting with a language code.

So there is a long, long history on Unix/Linux of naming dictionaries so that no one has to guess what language they are for, especially as these dictionaries are shared across many apps.

That is why there really is no need to associate language codes with dictionaries. It would just make them harder to use and share, with repeated mappings needed for multiple apps.

Sorry but Sigil will not be changing how we handle this.

Glad to hear you got it working!
Old 03-03-2021, 04:27 AM   #10
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by ebray187 View Post
Also, I have set <dc:language>es</dc:language> in the content.opf file.


An EPUB's language goes like a pyramid:
  • book-level = content.opf
  • chapter-level = <html>
  • word-level = <span>

1. Setting the content.opf correctly is the most important! And you did that great.

This says "this book is in Spanish!"

2. (Optional) Add lang + xml:lang to your <html>:

Chapter01.xhtml (Before):

Code:
<html xmlns="http://www.w3.org/1999/xhtml">

Chapter01.xhtml (After):

Code:
<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es">

This says "this chapter is Spanish!"

3. (Super duper optional) Mark "foreign words" with their language:

Before:

Code:
<p>¿Puedo ir al bathroom, por favor?</p>

After:

Code:
<p>¿Puedo ir al <span lang="en" xml:lang="en">bathroom</span>, por favor?</p>

This says "the entire book/chapter/sentence is in Spanish, but the word 'bathroom' is English!"
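
The three levels above can be sketched in Python as a walk over the markup in which the innermost language declaration wins. This only illustrates the inheritance idea and says nothing about how Sigil implements it; `text_with_language` is a hypothetical helper, and `book_language` stands in for the dc:language value from content.opf.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def text_with_language(element, inherited):
    """Yield (language, text) pairs; an inner xml:lang/lang overrides the outer one."""
    lang = element.get(XML_LANG) or element.get("lang") or inherited
    if element.text and element.text.strip():
        yield lang, element.text.strip()
    for child in element:
        yield from text_with_language(child, lang)
        # Text after a child's closing tag belongs to the parent's language.
        if child.tail and child.tail.strip():
            yield lang, child.tail.strip()


chapter = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es">'
    '<body><p>¿Puedo ir al <span lang="en" xml:lang="en">bathroom</span>'
    ', por favor?</p></body></html>'
)
book_language = "es"  # book level: what dc:language would supply
pairs = list(text_with_language(chapter, book_language))
for lang, text in pairs:
    print(lang, text)
```

Only the text inside the span is reported as "en"; everything else falls back to the chapter or book language.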

Quote:
Originally Posted by ebray187 View Post
I had understood that adding the xml:lang="X" was enough for it to work properly.
Any time you're marking which language, it's good practice to use BOTH lang + xml:lang.

The basic idea is: lang is for HTML, and xml:lang is for XML.

I explained a little more in this post a few days ago:

"Search and Replace" (Post #11)

And one common error that occurs is someone having the book + chapters be mismatched.

(So a Spanish book "es", but you accidentally set English "en" in an HTML chapter. This is pretty easy to spot in the Language column in Tools > Spellcheck > Spellcheck.)
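
If you would rather check for that mismatch outside Sigil, a rough Python sketch follows. The function names and the sample OPF and chapter strings are made up for illustration. It compares only the primary language subtag, and it assumes the files parse as plain XML (real chapters with a DOCTYPE and named entities may need a more forgiving parser).

```python
import xml.etree.ElementTree as ET

DC_LANGUAGE = "{http://purl.org/dc/elements/1.1/}language"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def primary_subtag(tag):
    """Reduce a tag such as 'es-AR' to its language part, 'es'."""
    return tag.split("-")[0].lower()


def mismatched_chapters(opf_text, chapters):
    """Return names of chapters whose <html> language disagrees with dc:language."""
    element = ET.fromstring(opf_text).find(f".//{DC_LANGUAGE}")
    if element is None or not element.text:
        return []
    book = primary_subtag(element.text.strip())
    mismatched = []
    for name, xhtml in chapters.items():
        root = ET.fromstring(xhtml)
        lang = root.get(XML_LANG) or root.get("lang")
        # A chapter with no language attribute simply inherits the book language.
        if lang and primary_subtag(lang) != book:
            mismatched.append(name)
    return mismatched


OPF = (
    '<package xmlns="http://www.idpf.org/2007/opf" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">'
    '<metadata><dc:language>es</dc:language></metadata></package>'
)
CHAPTERS = {
    "Chapter01.xhtml": '<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es"><body/></html>',
    "Chapter02.xhtml": '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><body/></html>',
}
print(mismatched_chapters(OPF, CHAPTERS))
```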

Last edited by Tex2002ans; 03-03-2021 at 04:32 AM.
Old 03-03-2021, 09:46 AM   #11
ebray187
Thank you both for taking the time to respond.

I fully understand what you are saying and it certainly makes a lot of sense. The only thing I can add is that I did not find anything in the documentation (changelog or manual) saying that, as of version 1.4, dictionaries must follow the Hunspell naming convention to work correctly with Sigil (despite how well it's explained in this thread as a fundamental practice).

I understand that in this same forum you are working on an update of the manual, so I take the opportunity to mention that it would be very helpful to include the information that you and others have kindly shared in this and other posts about it.

The question that remains for me (and that I think was lost in the translation of my last message) is how a user can define which dictionary to use for each language. Perhaps it is not so common in English, but for example between Argentine Spanish and Spain Spanish there are many spelling differences. Before 1.4 I simply changed the dictionary in preferences, today I don't know how to do it from Sigil without altering the epub or constantly changing the file names of my dictionaries.

Greetings and again, thank you very much for the time and patience.

Last edited by ebray187; 03-03-2021 at 09:48 AM.
Old 03-03-2021, 09:54 AM   #12
ebray187
Quote:
Originally Posted by ebray187 View Post
Before 1.4 I simply changed the dictionary in preferences, today I don't know how to do it from Sigil without altering the epub or constantly changing the file names of my dictionaries.
I mean taking into account that the Primary Language Dictionary does not affect the spellchecker but only the underlining of the code view.

Last edited by ebray187; 03-03-2021 at 09:59 AM.
Old 03-03-2021, 10:01 AM   #13
KevinH
Regional language differences just mean you use a language code that includes a region. There are many, many variations of English, but each has a different region code (as do the dictionaries).

en_US, en_GB, and en_CA are different dictionaries targeted at the US, Great Britain, and Canada respectively. The dictionary names all start that way.

The dc:language code allows the region to be used: en-US instead of just en

The xml:lang and lang attributes also allow a region to be specified: "en-GB" instead of just "en".
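
Note the different separators: the language codes in the epub use a hyphen (en-GB), while the dictionary names use an underscore (en_GB). A small Python sketch of that conversion, assuming a simple language-plus-region tag; `dictionary_stem` is a hypothetical helper and not part of Sigil:

```python
def dictionary_stem(language_tag: str) -> str:
    """Turn a tag like 'es-AR' into the dictionary-style stem 'es_AR'."""
    parts = language_tag.replace("_", "-").split("-")
    language = parts[0].lower()
    # Only a 2-letter second part is treated as a region in this sketch.
    if len(parts) > 1 and len(parts[1]) == 2:
        return f"{language}_{parts[1].upper()}"
    return language


print(dictionary_stem("es-AR"))
print(dictionary_stem("es"))
```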

The language pulldown supports a large set of regional language codes. Here is a code snippet for just Spanish:

Code:
         "es"    << tr("Spanish") <<
         "es-AR" << tr("Spanish") + QString(" - ") + tr("Argentina") <<
         "es-BO" << tr("Spanish") + QString(" - ") + tr("Bolivia") <<
         "es-CL" << tr("Spanish") + QString(" - ") + tr("Chile") <<
         "es-CO" << tr("Spanish") + QString(" - ") + tr("Columbia") <<
         "es-CR" << tr("Spanish") + QString(" - ") + tr("Costa Rica") <<
         "es-DO" << tr("Spanish") + QString(" - ") + tr("Dominican Republic") <<
         "es-EC" << tr("Spanish") + QString(" - ") + tr("Ecuador") <<
         "es-SV" << tr("Spanish") + QString(" - ") + tr("El Salvador") <<
         "es-GT" << tr("Spanish") + QString(" - ") + tr("Guatemala") <<
         "es-HN" << tr("Spanish") + QString(" - ") + tr("Honduras") <<
         "es-MX" << tr("Spanish") + QString(" - ") + tr("Mexico") <<
         "es-NI" << tr("Spanish") + QString(" - ") + tr("Nicaragua") <<
         "es-PA" << tr("Spanish") + QString(" - ") + tr("Panama") <<
         "es-PY" << tr("Spanish") + QString(" - ") + tr("Paraguay") <<
         "es-PE" << tr("Spanish") + QString(" - ") + tr("Peru") <<
         "es-PR" << tr("Spanish") + QString(" - ") + tr("Puerto Rico") <<
         "es-ES" << tr("Spanish") + QString(" - ") + tr("Spain") <<
         "es-UY" << tr("Spanish") + QString(" - ") + tr("Uruguay") <<
         "es-VE" << tr("Spanish") + QString(" - ") + tr("Venezuela") <<
As for adding this to the user manual, the number of users who do not use the normal hunspell dictionaries is very very small. If people do run into difficulties they can of course come here to our User Forum on Mobileread to get help.

That said, if you would like to help edit the user-guide with additional information, we would be happy to consider it for inclusion.

Hope this helps!
Old 03-03-2021, 10:11 AM   #14
KevinH
The choice of Primary Language dictionary does impact the Spellcheck dialog and real-time spell checking (red squiggly) in general.

If you choose en_GB as your Primary dictionary, then everywhere you use just "en" as a language code or xml:lang attribute, it will map to that dictionary over the en_US one. But Sigil has to be able to determine the language of a dictionary from the long-standing dictionary naming convention.
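
As a rough illustration of that preference, here is a Python sketch of one way the choice could be made. It is a guess at the behavior described here, not Sigil's code: `pick_dictionary` is a hypothetical name, and the alphabetical fallback when the Primary dictionary does not match is purely an assumption.

```python
def pick_dictionary(language_tag, installed, primary):
    """Pick an installed dictionary stem for a language tag, or None."""
    wanted = language_tag.replace("-", "_")
    if wanted in installed:
        return wanted  # exact match, e.g. "en-GB" -> "en_GB"
    language = wanted.split("_")[0]
    if primary in installed and primary.split("_")[0] == language:
        return primary  # a bare "en" follows the Primary dictionary
    # Assumed fallback: first installed dictionary for that language.
    for stem in sorted(installed):
        if stem.split("_")[0] == language:
            return stem
    return None


installed = {"en_US", "en_GB", "es_ES"}
print(pick_dictionary("en", installed, primary="en_GB"))
print(pick_dictionary("es", installed, primary="en_GB"))
```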

In Code View the red squiggly is determined completely by the Primary and Secondary dictionaries chosen, and it ignores any lang or xml:lang attributes, as using lang and xml:lang is not an epub2 requirement if dc:language is set.

This is useful when only one language is used, as specified in dc:language and nowhere else, or when only a few words are taken from a second language.

To support true multilanguage spell checking, the use of xml:lang or lang attributes is required, and that is what is recommended for epub3 / html5.
Old 03-03-2021, 10:20 AM   #15
ebray187
You explained it like an open book. Thanks!