
MobileRead Forums > E-Book Software > Calibre > Editor

Old 08-11-2014, 02:46 AM   #1
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
About language and spelling

Hi

I began working on a French OCR file to produce an EPUB. I then realized that all the HTML files declared the wrong language (English) in their header. I changed all the files to "fr-FR" (see below) and saved.

However, when I later checked the spelling (using a French dictionary), I saw that a lot of words were still mistakenly treated as English (see screenshot).

Is this a bug, or did I make a mistake when changing the language?

Code:
<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-FR">
Attached thumbnail: orthographe.png (83.1 KB)

Last edited by roger64; 08-11-2014 at 02:48 AM.
Old 08-11-2014, 02:43 PM   #2
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
It might be useful to have a look at this recent thread:

https://www.mobileread.com/forums/sho...d.php?t=243887

Try declaring the language in the content.opf file and bear in mind that Kovid fixed a bug to do with this only last week.

BobC
Old 08-11-2014, 03:12 PM   #3
roger64
Wizard
@Bob

Thanks for your help.

I have read this comment from Kovid Goyal "The language from content.opf will work unless the html files define their own languages, in which case the latter will win."

From what I can see, we need to specify the language in both places: not only in the HTML files but also in the content.opf file. Now all the words in my EPUB are treated as French.

EDIT: using the latest Calibre version
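
For anyone finding this later, here is a minimal sketch of the two places involved. It assumes a typical content.opf with the usual Dublin Core namespace; the rest of the metadata is omitted, and the "fr-FR" value simply mirrors the one used in the HTML files above.

```xml
<!-- content.opf (fragment): book-level language -->
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:language>fr-FR</dc:language>
</metadata>

<!-- opening tag of each HTML file: file-level language -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-FR">
```

Per the quote from Kovid above, the value in the HTML file is the one that wins when both are present, so the two should agree.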
Old 08-11-2014, 04:59 PM   #4
BobC
Guru
@Roger

The other point that occurs to me is that you have:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-FR">

What happens if you use:

<html xmlns="http://www.w3.org/1999/xhtml" lang="fr">

Of course, I don't know whether your files are XHTML or just HTML (or even whether it would make a difference).

BobC
Old 08-11-2014, 11:38 PM   #5
davidfor
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
According to the other thread, the editor chooses the language from the most specific setting available. In order of precedence, that is:
  1. <tag lang="some_language">
  2. <html lang="some_language">
  3. opf <dc:language>some_language</dc:language>
  4. editor-preferred-language "some_language"

I tested it with a book in English that had some dialog in Spanish (a character tended to swear in Spanish) and Hebrew (different character, same reason). When I wrapped spans around the dialog, or added a language attribute to a paragraph, the spelling errors went away. Neat, but a lot of work.
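
A rough sketch of that kind of markup, tying it back to levels 1 and 2 of the list above. The sentences are placeholders (not taken from the book tested); "es" and "he" are the standard language codes for Spanish and Hebrew.

```xml
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head><title>Example</title></head>
  <body>
    <!-- No lang here: checked as "en", taken from the <html> tag (level 2) -->
    <p>He dropped the hammer on his foot.</p>

    <!-- Level 1: the span sets the language for its own text only -->
    <p>He dropped the hammer and shouted, <span lang="es">¡Caramba!</span></p>

    <!-- Level 1 again, this time on a whole paragraph -->
    <p lang="he">שלום</p>
  </body>
</html>
```

Setting the attribute on the paragraph saves a span when the whole paragraph is in the other language.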

I also tried the OPF and file-level settings. I used "en-US" in the OPF and "en-GB" in the file, and "colour" was treated as correctly spelt and "color" as incorrect.

Something I had forgotten to try last week was what happens if the file is less specific than the OPF. So I have just tried "en-GB" in the OPF and "en" in the file. With this combination, whatever default is specified in the editor is used. So choosing US in the editor showed "colour" as incorrect and "color" as correct.

As for the attributes on the html tag, the EPUB I used started with both (lang and xml:lang). Either seems to work, but the non-XML one (lang) seemed to take precedence.
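
For reference, a sketch of an html tag carrying both attributes, using the "fr" value from earlier in the thread. Keeping the two values identical is a common convention and sidesteps the question of which one wins.

```xml
<!-- lang is the plain HTML attribute, xml:lang the XML one -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="fr">
```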

One other thing I just noticed: if I change the language in the OPF, I have to save and reopen the file for it to take effect. Changing any of the others takes effect in the current session.
Old 08-12-2014, 02:00 AM   #6
roger64
Wizard
Thanks for this useful information
Old 08-26-2014, 12:22 PM   #7
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Hm,
I have
- html <xml lang="de">
- opf <dc:language>de</dc:language>
There is no tag-level language information, and I don't know how to find the "editor-preferred language" (maybe due to the translation).

When I click "tools-spellcheck", the word list of the German dictionary that I have installed and marked as active appears (the default one is no longer marked as active). But in the "live spellcheck" in code view, almost every German word is marked as erroneous, so it must be the default dictionary that is working there. How can I change the dictionary for the "live spellcheck", please?

Problem resolved: The Hyphenation tool had been activated!

Last edited by Leonatus; 08-26-2014 at 12:48 PM. Reason: Problem resolved