Old 07-09-2014, 05:19 AM   #16
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
Quote:
Originally Posted by BetterRed View Post

Be nice to have the ability to exclude paragraphs too - to avoid checking quotes in the original vernacular - eg Chaucer, Shakespeare etc

BR
Just thinking laterally on this - if you consider the paragraphs as being in a "foreign" language for which there is no dictionary would this do the trick ?

So you would need to give the paragraph a style where the language code was, say, "en-oe" (for Old English). The trouble here is that doing so may take more effort than the benefit is worth. It would mean wrapping the portion of text in a <div lang="en-oe"> or using CSS styling to achieve the same effect.

If Calibre interpreted "en-oe" as English it might need a dummy language code.

HOWEVER, a quick check seems to indicate that the Editor treats any declared language code for which it doesn't have a dictionary as the language of the main file declaration or, in the absence of a language declaration in the header, as English (at least on my machine).

So a workaround would appear to be to install a dictionary with no words in common with English (Klingon ?) and declare the undesired paragraphs as that language.

Of course if there was actually a "en-oe" dictionary that would be great.

EDIT - given the context Middle English might have been a better suggestion, however it's the general principle I was trying to illustrate.

BobC

Last edited by BobC; 07-09-2014 at 07:28 AM.
Old 07-09-2014, 09:50 AM   #17
BetterRed
null operator (he/him)
Posts: 22,003
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Thanks Bob, I wondered about using a language too, but having to create a pseudo dictionary means it's a contrived solution - on principle I don't like using contrived solutions to work around trivial annoyances. IMO Tex's "calibre_ignore_spellcheck" class would be more elegant - but...

BR
Old 07-09-2014, 11:17 AM   #18
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Aren't quotes likely to be formatted differently, so it would be pretty easy to ignore them?
Old 07-09-2014, 02:10 PM   #19
BobC
Guru
Quote:
Originally Posted by BetterRed View Post
Thanks Bob, I wondered about using a language too, but having to create a pseudo dictionary means it's a contrived solution - on principle I don't like using contrived solutions to work around trivial annoyances. IMO Tex's "calibre_ignore_spellcheck" class would be more elegant - but...

BR
Well FYI that doesn't work. I butchered a simple oxt dictionary to have a language code of "none" but Calibre was too clever - it wouldn't import the dictionary, complaining that it had an invalid language code.

(the oxt imported into Libre Office without a complaint)

BobC
Old 07-09-2014, 03:57 PM   #20
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by mrmikel View Post
Aren't quotes likely to be formatted differently, so it would be pretty easy to ignore them?
Not necessarily, they could be inline quotes.
Old 07-09-2014, 07:12 PM   #21
BetterRed
null operator (he/him)
Quote:
Originally Posted by mrmikel View Post
Aren't quotes likely to be formatted differently,
Possibly, and if not then I guess one could make it so

Quote:
Originally Posted by mrmikel View Post
so it would be pretty easy to ignore them?
But I'm not the problem; I have no problem ignoring the text no matter what the format. The problem is how to have the spell checker ignore text with a specific format, or whatever.

As I've said, for me this issue falls into the nice-to-have/trivial-annoyance/wishful-thinking/in-your-dreams-BR... category. I never would have thought it would attract so much attention.

BR
Old 07-09-2014, 08:31 PM   #22
BetterRed
null operator (he/him)
Quote:
Originally Posted by BobC View Post
Well FYI that doesn't work. I butchered a simple oxt dictionary to have a language code of "none" but Calibre was too clever - it wouldn't import the dictionary, complaining that it had an invalid language code.

(the oxt imported into Libre Office without a complaint)

BobC
So, why not use a dictionary for a real language one is never likely to really need? I just installed the Estonian oxt. No disrespect to Estonians - it was a rational decision; I chose it because it uses Latin characters but it's not Indo-European.

How do I mark a paragraph (or define a class) as using Estonian (its code is 'et')? I looked at W3Schools, but as usually happens there, my eyes glazed over. I.e., what would I need to add to this to flag it as Estonian?

Code:
.block10 {
    display: block;
    font-size: 0.75em;
    text-align: justify;
    text-indent: 18pt;
    padding: 0;
    margin: 0;
}
I really shouldn't be wasting my time or anyone else's on this, but...

BR

Last edited by BetterRed; 07-10-2014 at 08:24 AM.
Old 07-09-2014, 10:22 PM   #23
kovidgoyal
creator of calibre
Posts: 45,597
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you want to change the language of a tag in HTML, you have to add the lang attribute, like this:

<p lang="fr">

which changes the language to French.
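
For the .block10 class asked about earlier in the thread, a minimal sketch might look like the following. CSS can select on a language but cannot declare one, so the attribute has to go on the element itself while the class keeps its styling. The paragraph text is a placeholder, and pairing lang with xml:lang is an assumption based on the file being XHTML, as EPUB content usually is.

```html
<!-- Hypothetical paragraph: the class keeps its styling, the attributes declare the language -->
<p class="block10" lang="et" xml:lang="et">Placeholder text to be treated as Estonian.</p>
```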
Old 07-09-2014, 10:47 PM   #24
BetterRed
null operator (he/him)
Perfect - see attachment

And I don't need to install an Estonian dictionary to use the language tag, seemingly it just needs to be a valid one.

Kovid & BobC.

BR
Attached thumbnail: Capture.JPG
Old 07-10-2014, 12:52 AM   #25
Tex2002ans
Wizard
Quote:
Originally Posted by BetterRed View Post
So, why not use a dictionary for a real language one is never likely to really need? I just installed the Estonian oxt. No disrespect to Estonians - it was a rational decision; I chose it because it uses Latin characters but it's not Indo-European.
Well, I wouldn't go marking text as a language that it isn't; this is just going to cause more problems than the one you are trying to solve.

I had some back-and-forth with Jellby, who was helping me transcribe Greek letters, and he convinced me to start marking up Greek correctly. Here is how I handle it now:

Quote:
<p>[...] The division of labor turns the self-sufficient individual into the <span class="greek" xml:lang="grc">ζῷον πολιτικόν</span> dependent on his fellow men, the social animal of which Aristotle spoke. Hostilities between one animal and another, or between one savage and another, in no way alter the economic basis of their existence. [...]</p>
Also, this site gives some of the reasons why you would want to tag languages correctly:

http://www.unimelb.edu.au/accessibil.../language.html

Quote:
Language information specified via the lang attribute may be used by a user agent to control rendering in a variety of ways. Some situations where author-supplied language information may be helpful include:
  • Assisting search engines
  • Assisting speech synthesizers
  • Helping a user agent select glyph variants for high quality typography
  • Helping a user agent choose a set of quotation marks
  • Helping a user agent make decisions about hyphenation, ligatures, and spacing
  • Assisting spell checkers and grammar checkers
While there isn't A TON of benefit from marking it up now, a lot of the reasoning for marking up languages so in-depth is to future-proof the HTML.

"Assisting speech synthesizers" is extremely helpful with Text->Audio programs.

As to the typography side of things, now that I have stumbled into the world of LaTeX, having the languages marked properly allows the hyphenation dictionaries to work, which is (REALLY) important. And for languages with completely foreign character sets like Chinese or Greek, it allows you to easily swap in a different font.

There is also fantastic functionality built into LaTeX which easily allows you to swap between different rulesets (what quotation marks should be used, spacing rules around quotations, where linebreaks are allowed, etc. etc.).

Who knows, maybe ereaders in the future would be able to do more fancy stuff like that too with properly marked-up text.
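
Something along these lines is already expressible in CSS through the :lang() selector, although how well any given ereader supports it is not something this thread establishes. A minimal sketch, assuming the Greek is tagged as in the example above; the font name is a placeholder for whatever Greek-capable font is embedded in the book.

```html
<style>
  /* Applies to any element whose declared language is Ancient Greek */
  :lang(grc) {
    font-family: "Placeholder Greek Font", serif;
  }
</style>

<!-- Same span as in the post above, with lang added alongside xml:lang -->
<p>[...] into the <span class="greek" lang="grc" xml:lang="grc">ζῷον πολιτικόν</span> dependent on his fellow men [...]</p>
```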

Now, in a perfect world, you would mark every little saying as French, German, Spanish, etc. etc.... but that just takes way too long (the marginal benefit is not worth it to me), so I just settle on doing it for Greek.

Priority #1 is to get the dang books digitized and up online... way lower priority can be to go back and add in the language markup as needed. (Or when I get around to LaTeXing the books).

Last edited by Tex2002ans; 07-10-2014 at 01:05 AM.
Old 07-10-2014, 04:32 AM   #26
BobC
Guru
Like Tex, I don't like faking the language code, which is one reason I tried to come up with an empty-dictionary approach. However, as you can use any legal language code, I think I'm going to use the one for Ido (io) - a little-used variant of Esperanto for which it is unlikely there is a dictionary (or much literature in the language)!

Libre Office gets round this by having a setting under "Language" of "None" (Ignore spelling), but it doesn't have a setting for Ido.

BobC
Old 07-10-2014, 05:20 AM   #27
kovidgoyal
creator of calibre
Use lang="und", which is the ISO 639-3 code for undetermined; it will prevent the spell checker from operating.
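
A sketch of how that might look for the Chaucer case raised earlier in the thread, covering both a block quotation and an inline quote. The element choices and the surrounding sentence are illustrative assumptions; only the lang="und" attribute comes from the advice above.

```html
<!-- Block quotation: spell checking should be skipped for everything inside -->
<blockquote lang="und">
  <p>Whan that Aprille with his shoures soote,<br/>
  The droghte of March hath perced to the roote</p>
</blockquote>

<!-- Inline quote: only the span is tagged, so the rest of the sentence keeps the document language -->
<p>The pilgrims set out <span lang="und">to Caunterbury they wende</span>, as the poem has it.</p>
```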
Old 07-10-2014, 06:07 AM   #28
BetterRed
null operator (he/him)
@Tex2002ans & BobC both said, 'not a good idea to use a fake language code.' You won't get any arguments from me on that. As I said before, I don't like contrivances.

That's what led me to seek a language-in-class solution - use it whilst spell checking, then lose it.

If/when I ever use it in the HTML it'll be transient. I was straightening out a 19th-century legal text recently that had a boatload of lengthy M.E. quotations; that's the sort of occasion when I might use a 'fake language'.

Interesting exchange.

I'd been wondering if the Tex2002ans handle was a 'nod' to TeX, seemingly not. Maybe TPTB would allow you to upgrade the x to an X, then it could be so :lol:

Added : Kovid just flew in and dropped a nugget of knowledge

BR

Last edited by BetterRed; 07-10-2014 at 06:10 AM.
Old 07-10-2014, 07:34 AM   #29
BobC
Guru
Quote:
Originally Posted by kovidgoyal View Post
Use lang="und", which is the ISO 639-3 code for undetermined; it will prevent the spell checker from operating.
That's what I had been looking for but I couldn't find the code. I was looking for something like "none" or "null".

Armed with that, I've also found that:

ang (Old English) and enm (Middle English) are probably what is needed for the Chaucer/Shakespeare stuff, and they do the job.

Thanks Kovid for the pointer.
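
If the quotations are to stay tagged permanently, the codes above could be applied like this. This is only a sketch: the sample lines are stand-ins, and whether the spell checker skips them presumably depends on no dictionary being installed for ang or enm, which is an assumption here.

```html
<!-- Old English (ISO 639-3 "ang"); sample line from Beowulf -->
<p lang="ang" xml:lang="ang">Hwæt! We Gardena in geardagum</p>

<!-- Middle English (ISO 639-3 "enm"); sample line from Chaucer -->
<p lang="enm" xml:lang="enm">Whan that Aprille with his shoures soote</p>
```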
Old 07-10-2014, 08:22 AM   #30
BetterRed
null operator (he/him)
Quote:
Originally Posted by BobC View Post
That's what I had been looking for but I couldn't find the code. I was looking for something like "none" or "null".

Armed with that, I've also found that:

ang (Old English) and enm (Middle English) are probably what is needed for the Chaucer/Shakespeare stuff, and they do the job.

Thanks Kovid for the pointer.
Because I'll only use it when the olde worlde text is a nuisance, I think I'll stick with 'und' and delete it when I'm done spell checking.

Added - It would seem that Sigil's spell checker does not ignore xml:lang="und"

BR

Last edited by BetterRed; 07-10-2014 at 08:38 AM.