08-06-2017, 09:13 AM | #1 |
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Spell checking multiple languages
https://manual.calibre-ebook.com/edi...ds-in-the-book
The example in the above reference shows how to use lang=" " to mark a word as 'not book default' so that the spell checker knows to use another dictionary (or so I assume it works that way):
Code:
<div lang="en_US">color <span lang="en_GB">colour</span></div>
I could go through and add them to my own dictionary, but that's not really spell checking. Is there any way to select 2 or more languages for the entire book? Is there another way to do it other than one at a time?
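For illustration only, here is a small Python sketch of how a lang attribute applies to the text nested inside it: each word takes the language of the nearest enclosing element that declares one. This is an assumption about the general principle, not a description of calibre's internals; the names LangWords and words_with_language are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class LangWords(HTMLParser):
    """Collect (word, language) pairs, inheriting lang from enclosing tags."""

    def __init__(self, default_lang):
        super().__init__()
        self.stack = [default_lang]  # innermost language is the last entry
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            # A tag without its own lang inherits the current language.
            self.stack.append(dict(attrs).get("lang") or self.stack[-1])

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.words.extend((word, self.stack[-1]) for word in data.split())


def words_with_language(markup, default_lang):
    parser = LangWords(default_lang)
    parser.feed(markup)
    parser.close()
    return parser.words


sample = '<div lang="en_US">color <span lang="en_GB">colour</span></div>'
print(words_with_language(sample, "en_US"))
```

With the example from the manual, "color" is paired with en_US and "colour" with en_GB, which is what lets a spell checker pick a different dictionary for each word.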
08-06-2017, 11:30 AM | #2 |
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
There are different scenarios:
Sometimes you have books with explicitly defined languages for a word, a paragraph, a file, or the whole book. You can add an additional language to the spell checker. It will be used for all explicitly marked words, paragraphs and so on (e.g. like in your example).
Usually there is only one language defined in a book. For these cases I use user-defined dictionaries (mostly two: one for foreign-language words and one for special word constructs used in the actual book). User-defined dictionaries can be set to active or inactive, so you have full freedom to use them as you like.
The only thing that is missing is the ability to deactivate one of the main dictionaries when you use more than one. I thought this was implemented years ago when the spell checker was introduced, but I can't find it again; maybe I remember wrongly and had only asked for it.
For this I use a little trick: I first work with the foreign language as a false positive, copy all correctly identified words into a new user dictionary for the foreign language, and then switch back to the major language, including the new user dictionary as an additional dictionary.
Take a look at the section Import word lists. This is very helpful for managing huge word lists. It also lets you add a language identifier to a list or to a single word. You can copy the words contained in a user dictionary to the clipboard to create your own sets of useful combinations.
08-06-2017, 12:54 PM | #3 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The correct way to do it is to mark text in different languages. That allows people reading the book in the future to, for example, look up words in the dictionary of the marked language while reading.
However, if you want to automate just spell checking, you can do so using a search and replace in function mode (but you need to be able to program a bit for that). The idea would be similar to https://manual.calibre-ebook.com/fun...phenated-words Here, when the word is not recognized by the main dictionary, you wrap it in a <span lang="secondary language">word</span>. Then re-run spell check. Now words reported as misspelled will have failed to match in both languages. Fix all the words that need fixing and then, when you are done, run a search and replace to remove the inserted span tags.
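A minimal sketch of what such a function could look like, for anyone who wants a starting point. The replace signature and the dictionaries.recognized(word) call follow the hyphenated-words example in the manual; the search expression is assumed to be >[^<>]+< as in that example. SECONDARY_LANG, WORD, strip_spans and ToyDictionaries are hypothetical names for this illustration, and the code has only been exercised outside calibre with the toy stand-in shown at the bottom.

```python
import html
import re

SECONDARY_LANG = "fr"  # assumption: the second language used in the book
# Letters, optionally joined by an internal apostrophe or hyphen.
WORD = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")
SPAN = re.compile(r'<span lang="%s">([^<>]*)</span>' % re.escape(SECONDARY_LANG))


def replace(match, number, file_name, metadata, dictionaries, data,
            functions, *args, **kwargs):
    """Wrap every word the main dictionary rejects in a lang span."""
    text = html.unescape(match.group()[1:-1])  # drop the leading > and trailing <
    out, pos = [], 0
    for w in WORD.finditer(text):
        out.append(html.escape(text[pos:w.start()], quote=False))
        word = html.escape(w.group(), quote=False)
        if dictionaries.recognized(w.group()):
            out.append(word)
        else:
            out.append('<span lang="%s">%s</span>' % (SECONDARY_LANG, word))
        pos = w.end()
    out.append(html.escape(text[pos:], quote=False))
    return ">%s<" % "".join(out)


def strip_spans(markup):
    """Cleanup step: remove the spans again once spell checking is done."""
    return SPAN.sub(r"\1", markup)


class ToyDictionaries:
    """Stand-in for calibre's dictionaries object, for trying this outside calibre."""

    def __init__(self, words):
        self.words = {w.lower() for w in words}

    def recognized(self, word):
        return word.lower() in self.words


toy = ToyDictionaries({"the", "cat", "said", "left"})
m = re.search(r">[^<>]+<", "<p>The cat said bonjour &amp; left</p>")
print(replace(m, 1, "ch1.html", None, toy, {}, None))
```

Two caveats: the search expression also matches text in places like <title>, and strip_spans would remove any genuine lang="fr" spans that were in the book before, as long as they contain only text. Inside calibre the cleanup would more likely be an ordinary regex search and replace using the same pattern.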
08-06-2017, 06:46 PM | #4 | |
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@DD -- thanks for possible workaround. I'll think on it
@kovid -- Agree that this is the right way to do it, but it's a lot of manual effort, and I'm not up to writing my own search/replace RE function. For a few foreign words, I could use [Insert Tag].
Side note about a possible User Manual problem with [Insert Tag]: https://manual.calibre-ebook.com/edit.html
I created a <span lang="fr"> tag, but it took a while to remember how I had created the tags I currently have. There doesn't seem to be any information in the link about creating / inserting a tag.
Last edited by phossler; 08-06-2017 at 06:51 PM.
|
08-07-2017, 03:39 AM | #5 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Just type it - tags are just text. The editor will automatically insert the close tag for you as soon as you type the open tag.
|
08-07-2017, 05:12 AM | #6 |
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
This is correct, and the problem is that you can't automate this for languages in a perfect way. For my native language the situation is more or less the same. Depending on the complexity of the source text, I go the other way around in these cases: I export/open the book text in MS Word as DOCX and let it do the job of declaring the language of each part. You first need to add the required spell checker to Word. It is not perfect but good enough to move forward. The flip side of the coin is that you lose the original document structure when you need to make a conversion, but compared to the time it takes to correct this part, that is only a minor item on the bill.
|
08-07-2017, 11:20 AM | #7 | |
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Quote:
Also, I've found that the editor will only complete the closing tag when I type </ |
|
08-07-2017, 11:25 AM | #8 | |
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Quote:
Since I spend most of my time 'fixing' an epub to be more readable on my kindle, it'd have to be a judgement call each time |
|
08-07-2017, 12:46 PM | #9 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@phossler: Not sure what you mean. You just click the insert tag button and it asks you to input the tag you want. I don't know how it could be more straightforward than that.
|
08-07-2017, 07:19 PM | #10 | ||||
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Yes, the whole process of picking a tag and then inserting it around the selected text is very easy (attachment)
When I said Quote:
e.g. After I insert the <b> .... Quote:
Quote:
Quote:
All very nice and user friendly |
||||
08-07-2017, 08:10 PM | #11 |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
It would be wonderful if in the spell check, you could simply change the language column value and have it replace (all) with the appropriate span tag, even if you had to click the replace button after changing (forcing) the language.
|
08-07-2017, 11:07 PM | #12 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@theducks: I don't think wrapping span tags around single words is a good idea; that will create lots of markup bloat.
What is really needed is a language markup tool/plugin. You give it a list of dictionaries, then it goes through the book and matches words against all the dictionaries in the list. Every contiguous series of words matched to the same dictionary then gets wrapped in a span tag with the correct language. That is basically how the language detect tool in Word works, I imagine.
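A rough sketch of that idea in Python, on plain text rather than real book markup. It is not an existing calibre tool or plugin. The dictionaries here are just toy word sets, and mark_languages, detect and render are hypothetical names. Words that no dictionary knows are treated as the default language, and a word known to several dictionaries stays with the language of the run it is in.

```python
import re

# Alternating word tokens and separator tokens (spaces, punctuation, digits).
TOKEN = re.compile(r"(?P<word>[^\W\d_]+)|(?P<sep>[\W\d_]+)")


def detect(word, dictionaries, current):
    """Return the language for word, preferring the language of the current run."""
    lowered = word.lower()
    if lowered in dictionaries.get(current, ()):
        return current
    for lang, words in dictionaries.items():
        if lowered in words:
            return lang
    return None


def render(lang, pieces, default_lang):
    """Join a run; only runs in a non-default language get a span."""
    body = "".join(pieces)
    if not body or lang == default_lang:
        return body
    return '<span lang="%s">%s</span>' % (lang, body)


def mark_languages(text, dictionaries, default_lang):
    out, run, run_lang, pending = [], [], default_lang, ""
    for m in TOKEN.finditer(text):
        if m.lastgroup == "sep":
            pending += m.group()
            continue
        lang = detect(m.group(), dictionaries, run_lang) or default_lang
        if lang == run_lang:
            run.append(pending + m.group())  # separator stays inside the run
        else:
            out.append(render(run_lang, run, default_lang))
            out.append(pending)  # separators between runs stay outside spans
            run, run_lang = [m.group()], lang
        pending = ""
    out.append(render(run_lang, run, default_lang))
    out.append(pending)
    return "".join(out)


toy_dictionaries = {
    "en": {"she", "said", "and", "left"},
    "fr": {"je", "ne", "sais", "quoi"},
}
print(mark_languages("She said je ne sais quoi, and left.", toy_dictionaries, "en"))
```

Here the four French words end up in a single span instead of four, which is the point about avoiding markup bloat. A real tool would also have to walk the HTML tree and respect existing tags and lang attributes.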
08-07-2017, 11:49 PM | #13 | |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
|
|
08-08-2017, 07:17 AM | #14 |
Wizard
Posts: 1,071
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@DD -- I am certainly not an expert, but it seems there are multiple reasons why someone would want to make CSS language-aware
http://www.w3.org/International/questions/qa-lang-why
@KG - agree about the possible markup bloat, so the contiguous word <span> would be the way to go. My example in #10 has 4 French words selected to be <span>-ed
08-08-2017, 08:03 AM | #15 |
null operator (he/him)
Posts: 20,572
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Nuh - main reason is so the reader can look up the word in an appropriate dictionary. You know, the ones that give you the meaning of the word, examples of use, a bit of etymology if you're lucky, and a translation if it's that sort of dictionary.
A spell checker might know floccinaucinihilipilification is correctly spelt, but will the reader know what it means?
BR
Last edited by BetterRed; 08-08-2017 at 08:51 AM.
|