Old 08-06-2017, 09:13 AM   #1
phossler
Guru
Posts: 729
Karma: 51686
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Spell checking multiple languages

https://manual.calibre-ebook.com/edi...ds-in-the-book

The example in the above reference shows how to use lang=" " to mark a word as 'not book default' so that the spell checker knows to use another dictionary (or so I assume it works that way):

Code:
<div lang="en_US">color <span lang="en_GB">colour</span></div>
I have an EPUB right now that has a large amount of foreign dialog, and it will be very time-consuming to mark all the passages.

I could go through and add them to my own dictionary, but that's not really spell checking.

Is there any way to select 2 or more languages for the entire book?

Is there another way to do it other than one at a time?
Old 08-06-2017, 11:30 AM   #2
Divingduck
Guru
Posts: 882
Karma: 1216240
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
There are different scenarios:
Sometimes a book has languages explicitly defined for a word, a paragraph, a file, or the whole book.

You can add an additional language to the spell checker. It will be used for all explicitly marked words, paragraphs and so on (e.g. as in your example).

Usually there is only one language defined in a book. For these cases I use user-defined dictionaries (mostly two: one for foreign-language words and one for special word constructs used in the actual book). User-defined dictionaries can be set to active or inactive, so you are free to use them however you want.

The only thing missing is the ability to deactivate one of the main dictionaries when you use more than one. I thought this was added years ago when the spell checker was implemented, but I can't find it again; maybe I remember wrongly and had only asked for it.

For this I use a little trick: first I run the check with the foreign language, copy all correctly identified words into a new user dictionary for that foreign language, and then switch back to the main language with the new user dictionary as an additional dictionary.
Take a look at the section Import word lists. It is very helpful for managing huge word lists. It also lets you add a language identifier to a list or to a single word. You can copy the words of a user dictionary to the clipboard to create your own sets of useful combinations.
[Attached image: calDict.JPG]
Old 08-06-2017, 12:54 PM   #3
kovidgoyal
creator of calibre
Posts: 31,550
Karma: 8685410
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The correct way to do it is to mark text in different languages. That allows people reading the book in the future to, for example, look up words in the dictionary of the marked language while reading.

However, if you want to automate just the spell checking, you can do so using a search and replace in function mode (but you need to be able to program a bit for that).

The idea would be similar to https://manual.calibre-ebook.com/fun...phenated-words

Here, when a word is not recognized by the main dictionary, you wrap it in a <span lang="secondary language">word</span>
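
A minimal sketch of such a function, assuming the function-mode signature shown in the calibre manual and its `dictionaries.recognized(word)` method. The search expression `>[^<>]+<` (text between tags, as in the manual's hyphenation example), the `SECONDARY_LANG` value and the `TOKEN` pattern are illustrative assumptions, not anything calibre prescribes:

```python
import re

# Assumption: the secondary language of the book is French.
SECONDARY_LANG = 'fr'

# Matches either an entity such as &amp; (left untouched) or a word made of
# letters, optionally joined by apostrophes or hyphens.
TOKEN = re.compile(r"&#?\w+;|[^\W\d_]+(?:['’-][^\W\d_]+)*")


def replace(match, number, file_name, metadata, dictionaries, data,
            functions, *args, **kwargs):
    # Intended to be paired with the search expression  >[^<>]+<
    # so that match.group() is a run of text between two tags.
    def wrap(m):
        token = m.group()
        # Keep entities and words the main dictionary already accepts.
        if token.startswith('&') or dictionaries.recognized(token):
            return token
        return '<span lang="%s">%s</span>' % (SECONDARY_LANG, token)

    return TOKEN.sub(wrap, match.group())
```

The single-word spans this produces are only temporary scaffolding for the second spell-check pass.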

Then re-run spell check. Now words reported misspelled will have failed to match in both languages. Fix all the words that need fixing and then when you are done, run a search and replace to remove the inserted span tags.
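
The cleanup step can be a plain regex search and replace. A hedged sketch in Python, where `strip_lang_spans` is a hypothetical helper name; note that it also removes any hand-made spans with the same `lang` value, so it only suits books where all such spans were inserted temporarily:

```python
import re


def strip_lang_spans(html, lang):
    """Unwrap <span lang="...">text</span>, keeping only the inner text.

    Only spans that contain plain text (no nested tags) are unwrapped.
    """
    pattern = re.compile(r'<span lang="%s">([^<>]*)</span>' % re.escape(lang))
    return pattern.sub(r'\1', html)
```

In the editor itself, the same idea should work as a Regex-mode search for `<span lang="fr">([^<>]*)</span>` with `\1` as the replacement.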
Old 08-06-2017, 06:46 PM   #4
phossler
@DD -- thanks for the possible workaround. I'll think on it.

@kovid -- Agree that this is the right way to do it

Quote:
<p>I lived in the seventeenth <span lang="fr">arrondissement</span>. The modernization project that had swept up the <span lang="fr">Avenue Neuilly</span> and was extending the smart side of Paris to the west had by-passed the dingy <span lang="fr">Quartier des Ternes</span>. I walked as far as the <span lang="fr">Avenue de la Grande Armee</span>. The Arc was astraddle the <span lang="fr">Etoile</span> and the traffic was desperate to get there. Thousands of red lights twinkled like bloodshot stars in the warm mist of the exhaust fumes. It was a fine Paris evening, Gauloises and garlic sat lightly on the air, ...</p>

but it is a lot of manual effort, and I'm not up to writing my own search/replace RE function
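
For a known list of foreign phrases, the wrapping does not need a dictionary lookup at all. A rough sketch, run outside calibre on the HTML text; `wrap_phrases` is a hypothetical helper, and it assumes the phrases do not also occur inside attributes or inside spans that are already marked:

```python
import re


def wrap_phrases(html, phrases, lang):
    """Wrap every occurrence of the given phrases in <span lang="...">."""
    if not phrases:
        return html
    # Longest first, so "Avenue de la Grande Armee" wins over "Avenue".
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(r'\b(?:%s)\b' % '|'.join(re.escape(p) for p in ordered))
    return pattern.sub(
        lambda m: '<span lang="%s">%s</span>' % (lang, m.group()), html)
```

Called as `wrap_phrases(text, ['Quartier des Ternes', 'arrondissement'], 'fr')`, it returns the text with each listed phrase bracketed by the span.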

For a few foreign words, I could use [Insert Tag]



Side note about possible User Manual problem with [Insert Tag]

https://manual.calibre-ebook.com/edit.html

I created a <span lang="fr"> tag, but it took a while to remember how I had created the tags I currently have. There doesn't seem to be any information in the link about creating / inserting a tag.
[Attached image: Capture.JPG]

Last edited by phossler; 08-06-2017 at 06:51 PM.
Old 08-07-2017, 03:39 AM   #5
HarryT
eBook Enthusiast
Posts: 81,478
Karma: 75604729
Join Date: Nov 2006
Location: UK
Device: Kindle Voyage, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by phossler View Post
I created a <span lang="fr"> tag, but it took a while to remember how I had created the tags I currently have. There doesn't seem to be any information in the link about creating / inserting a tag.
Just type it - tags are just text. The editor will automatically insert the close tag for you as soon as you type the open tag.
Old 08-07-2017, 05:12 AM   #6
Divingduck
Quote:
Originally Posted by phossler View Post
but lot of manual effort
This is correct, and the problem is that you can't automate this perfectly for languages. The situation is more or less the same for my native language. Depending on the complexity of the source text, I go the other way around in these cases: I export the book text as DOCX, open it in MS Word, and let Word do the job of declaring the language of each part. You first need to add the required spell checkers in Word. It is not perfect but good enough to move forward. The flip side of the coin is that you lose the original document structure in the conversion, but compared to the time it would otherwise take to correct this part, that is only a minor item on the bill.
Old 08-07-2017, 11:20 AM   #7
phossler
Quote:
Originally Posted by HarryT View Post
Just type it - tags are just text. The editor will automatically insert the close tag for you as soon as you type the open tag.
I find it easier to select the foreign text and then just click [Insert Tag] to bracket the text with <span lang="fr"> ..... </span>

Also, I've found that the editor will only complete the closing tag when I type </
Old 08-07-2017, 11:25 AM   #8
phossler
Quote:
Originally Posted by Divingduck View Post
This is correct and the problem is, that you can't automate this for languages in a perfect way.
Interesting approach, and I think that'd work when I start with very raw text.

Since I spend most of my time 'fixing' an epub to be more readable on my kindle, it'd have to be a judgement call each time
Old 08-07-2017, 12:46 PM   #9
kovidgoyal
@phossler: Not sure what you mean. You just click the insert tag button and it asks you to input the tag you want. I don't know how it could be more straightforward than that.
Old 08-07-2017, 07:19 PM   #10
phossler
Yes, the whole process of picking a tag and then inserting it around the selected text is very easy (attachment)

When I said

Quote:
Also, I've found that the editor will only complete the closing tag when I type </
I was referring to manually inserting a tag that Calibre auto-closes for me when I tell it where the scope ends

e.g. After I insert the <b> ....

Quote:
text text text <b> text text text text text text
.... when I add the 'closing tag' start characters </ ....

Quote:
text text text <b> text text text text text text</
.... Calibre auto completes it for me

Quote:
text text text <b> text text text text text text</b>

All very nice and user friendly
[Attached image: Capture.JPG]
Old 08-07-2017, 08:10 PM   #11
theducks
Well trained by Cats
Posts: 20,881
Karma: 20350468
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: K4NT, Galaxy Tab 2(RIP)
It would be wonderful if in the spell check, you could simply change the language column value and have it replace (all) with the appropriate span tag, even if you had to click the replace button after changing (forcing) the language.
Old 08-07-2017, 11:07 PM   #12
kovidgoyal
@theducks: I don't think wrapping span tags around single words is a good idea; that will create lots of markup bloat.

What is really needed is a language markup tool/plugin. You give it a list of dictionaries, then it goes through the book and matches words against all the dictionaries in the list. Every contiguous series of words matched to the same dictionary then gets wrapped in a span tag with the correct language.

That is basically how the language detect tool in word works, I imagine.
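
The core of that idea can be sketched in a few lines of Python. This is not an existing calibre tool: `mark_languages`, `classify` and `wrap` are hypothetical names, the dictionaries are assumed to be plain sets of lowercase words keyed by language code, and the sketch works on plain text only (it does not handle runs that cross tags):

```python
import re

WORD = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def classify(word, dictionaries, default_lang, prefer=None):
    """Return the language whose word set contains the word."""
    w = word.lower()
    # Stay in the language of the current run when it also knows the word.
    if prefer is not None and w in dictionaries.get(prefer, ()):
        return prefer
    for lang, words in dictionaries.items():
        if w in words:
            return lang
    return default_lang  # unknown words are left in the book language


def wrap(run, lang, default_lang):
    chunk = ''.join(run)
    if not chunk or lang == default_lang:
        return chunk
    return '<span lang="%s">%s</span>' % (lang, chunk)


def mark_languages(text, dictionaries, default_lang):
    """Wrap each contiguous run of same-language words in one span."""
    out, run, run_lang, pos = [], [], None, 0
    for m in WORD.finditer(text):
        gap = text[pos:m.start()]  # spaces and punctuation before the word
        lang = classify(m.group(), dictionaries, default_lang, prefer=run_lang)
        if run and lang == run_lang:
            run.append(gap)        # same language: the gap joins the run
        else:
            out.append(wrap(run, run_lang, default_lang))
            out.append(gap)
            run, run_lang = [], lang
        run.append(m.group())
        pos = m.end()
    out.append(wrap(run, run_lang, default_lang))
    out.append(text[pos:])
    return ''.join(out)
```

With toy word sets `{'en': {'the', 'dingy', 'today'}, 'fr': {'quartier', 'des', 'ternes'}}` and `'en'` as the default, the three French words in "the dingy Quartier des Ternes today" end up inside a single span rather than three.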
Old 08-07-2017, 11:49 PM   #13
theducks
Quote:
Originally Posted by kovidgoyal View Post
@theducks: I don't think wrapping span tags around single words is a good idea; that will create lots of markup bloat.

What is really needed is a language markup tool/plugin. You give it a list of dictionaries, then it goes through the book and matches words against all the dictionaries in the list. Every contiguous series of words matched to the same dictionary then gets wrapped in a span tag with the correct language.

That is basically how the language detect tool in word works, I imagine.
I guess the only reason for the markup is spell checking. A number of 'other' language words are in common usage in American literature (many have to do with eating) and the spell check flags them as wrong in en-US... but they may also be spelled wrong in their original language, and without language sensitivity...
Old 08-08-2017, 07:17 AM   #14
phossler
@DD -- I am certainly not an expert, but it seems there are multiple reasons why someone would want to make css language-aware

http://www.w3.org/International/questions/qa-lang-why

@KG -- agree about the possible markup bloat, so the contiguous-word <span> would be the way to go. My example in #10 has 4 French words selected to be <span>-ed
Old 08-08-2017, 08:03 AM   #15
BetterRed
null operator
Posts: 9,094
Karma: 7214933
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by theducks View Post
I guess the only reason for the markup, is spell checking.
Nuh - the main reason is so the reader can look up the word in an appropriate dictionary. You know, the ones that give you the meaning of a word, examples of use, a bit of etymology if you're lucky, and a translation if it's that sort of dictionary.

A spell checker might know floccinaucinihilipilification is correctly spelt, but will the reader know what it means?

BR

Last edited by BetterRed; 08-08-2017 at 08:51 AM.