04-18-2014, 09:08 AM   #16
theducks
Well trained by Cats
Quote:
Originally Posted by kovidgoyal
@Divingduck: Attach a file where you cannot jump to the next occurrence.

As for importing wordlists as user dictionaries, I can certainly implement that. However, in calibre a user dictionary contains not just a list of words but every word also has an associated language. Therefore, they cannot be exported as simple word lists without losing some information.

I don't understand why you need to turn off dictionaries. The spell check is perfectly capable of handling multilingual documents based on the language specified in the lang HTML attribute.

I 'correct' scanned fantasy and science fiction. I sometimes create universe-specific dictionaries that do not belong anywhere else.

Having the ability to invoke a specific set of dictionaries is a plus.

04-18-2014, 09:28 AM   #17
kovidgoyal
creator of calibre
@DrChiper: The spell check is most definitely aware of tags. In fact, it only checks text inside tags (other than script and style) and in the alt and title attributes (which are used for tooltips and descriptions). If you are getting words from anywhere else, it is likely a bug; please attach the file.

@theducks: That's what the user dictionaries are for.

@BR: I suggest setting up a user dictionary with your abbreviations. The correct solution for not flagging some words is not to simply ignore an entire file.

04-18-2014, 10:11 AM   #18
roger64
Wizard
Hi

This was the last big item left to do. How amazing that this huge task has been completed in so few months! Congratulations and thank you.


French

Here is a link to a good French dictionary in oxt format. It is, of course, a hunspell dictionary. It is used by Grammalecte, a grammar-checking extension for OpenOffice and LibreOffice, and it is updated quite frequently. Version 5.0.2 is from January 2014. It works pretty well with the Editor.
http://www.dicollecte.org/grammalecte/telecharger.php

Elided forms

Take a word: articulateur. You can also find l'articulateur. The editor displays it twice, in two separate places, once as "articulateur" and once as "l'articulateur", because elided forms seem to be counted as independent words (the same goes for d' and others). Grammatically, it is the same word.
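
The merging asked for here can be sketched in a few lines of Python. This is only an illustration of the idea, not how calibre counts words: the prefix list, `strip_elision`, and `count_words` are all hypothetical names, and the list of elided prefixes is deliberately incomplete.

```python
import re
from collections import Counter

# Common French elided prefixes (l', d', j', qu', ...). Illustrative, not exhaustive.
# Both the straight apostrophe and the typographic one (U+2019) are accepted.
ELISION = re.compile(
    r"^(?:l|d|j|m|t|s|n|c|qu|jusqu|lorsqu|puisqu)['\u2019]",
    re.IGNORECASE,
)


def strip_elision(word):
    """Return the word without a leading elided article or pronoun."""
    return ELISION.sub("", word)


def count_words(words):
    """Count words after folding elided forms onto their base word."""
    return Counter(strip_elision(w) for w in words)


counts = count_words(["articulateur", "l'articulateur", "d\u2019articulateur", "table"])
print(counts["articulateur"])
```

With this folding, "articulateur", "l'articulateur" and "d’articulateur" end up in a single entry, while a word such as "table" is left alone because the prefix must be followed by an apostrophe.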

GUI

Even in a full-screen window, I can see at most 20 words. I should be able to see at least a healthy 50, and it seems really necessary to be able to narrow the fields.

Several languages together

What exactly should I write, and where, if I have a book in French with lots of English words inside? I start from here:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-FR">

Last edited by roger64; 04-18-2014 at 12:48 PM.

04-18-2014, 10:56 AM   #19
mrmikel
Color me gone
I have been on the lookout for a British-American dictionary, since I work with a lot of British works, as well as American works and works about WWII in which both are used at the same time.

04-18-2014, 10:59 AM   #20
kovidgoyal
creator of calibre
@roger64: Regarding the treatment of elided forms: as I am not a French speaker, it is rather difficult for me to write rules for combining words. At the moment, all distinctly spelled words are treated as different words. Patches to change that are welcome.

If you want multiple languages, you do something like this:

<div lang="en">Some english text<span lang="fr">some french words</span></div>

Basically when the lang attribute exists on a tag, the contents of that tag are interpreted in the specified language. You can put a lang attribute on any tag, and they can be nested.
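
The nesting rule described above (the nearest enclosing lang attribute wins) can be sketched with Python's standard html.parser. This is a minimal illustration, not calibre's actual code: `LangResolver` and `resolve_languages` are hypothetical names, the French sample sentence is made up, and treating xml:lang the same way as lang is an assumption of this sketch.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class LangResolver(HTMLParser):
    """Pair each run of text with the language of its nearest enclosing tag."""

    def __init__(self, default_lang):
        super().__init__()
        self.stack = [default_lang]  # effective language of each open tag
        self.runs = []               # (language, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        # Assumption: xml:lang is honoured as a fallback for lang.
        lang = attrs.get("lang") or attrs.get("xml:lang") or self.stack[-1]
        self.stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.stack[-1], text))


def resolve_languages(markup, default_lang="und"):
    """Return (language, text) pairs for every non-empty text run."""
    parser = LangResolver(default_lang)
    parser.feed(markup)
    parser.close()
    return parser.runs


markup = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-FR"><body>'
    '<p>Il portait un <span lang="en">trench coat</span> gris.</p>'
    '</body></html>'
)
for lang, text in resolve_languages(markup):
    print(lang, text)
```

For a French book with English words, this means the root element carries the French language code and only the English passages need a span (or any other tag) with lang="en".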

04-18-2014, 11:23 AM   #21
roger64
Wizard
Thanks

I will get in touch with the Grammalecte author about elided forms.

04-18-2014, 11:27 AM   #22
Divingduck
Wizard
Quote:
Originally Posted by kovidgoyal
As for importing wordlists as user dictionaries, I can certainly implement that. However, in calibre a user dictionary contains not just a list of words but every word also has an associated language. Therefore, they cannot be exported as simple word lists without losing some information.

I saw prefs.json and understand what you mean. Maybe you could add an option to the import dialog for choosing the language of the user dictionary. That would give the flexibility of your optimized setup plus import functionality. For export, it may be a good idea to put the language in a default filename (e.g. user_de.txt).
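
A rough sketch of this import/export suggestion, in Python. The data layout is an assumption made for illustration (a flat list of (word, language) pairs); the thread does not show how prefs.json actually stores user dictionaries, and `split_by_language`, `import_word_list`, and the sample words are hypothetical.

```python
from collections import defaultdict


def split_by_language(entries):
    """Group (word, language) pairs into one sorted word list per language.

    The keys follow the user_<lang>.txt naming suggested in the post, so the
    language survives the export even though each file is a plain word list.
    """
    by_lang = defaultdict(set)
    for word, lang in entries:
        by_lang[lang].add(word)
    return {f"user_{lang}.txt": sorted(words) for lang, words in by_lang.items()}


def import_word_list(lines, lang):
    """Attach the language chosen at import time to every non-empty line."""
    return [(line.strip(), lang) for line in lines if line.strip()]


entries = [("Zauberstab", "de"), ("Muggel", "de"), ("hobbit", "en")]
files = split_by_language(entries)
for name, words in sorted(files.items()):
    print(name, words)
```

Exporting one file per language and asking for a language on import keeps the word/language pairing that a single mixed word list would lose.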

Quote:
I don't understand why you need to turn off dictionaries. The spell check is perfectly capable of handling multilingual documents based on the language specified in the lang HTML attribute.

Well, let me try to explain with a workflow. Germans read many translated books. These books often use phrases or words that may or may not be spelled correctly. Sometimes they use colloquial words, or an author decides to write a word in a special way. This sometimes makes it difficult to tell whether a word is spelled correctly (especially if the check uses a mixed-language database).
I often digitize books that use this kind of mixture. For these, my best workflow is to use only the native spelling language plus one specialized (user) dictionary, built up only for this particular book, that covers all the other words used. Sometimes I use a second one for a book series, which applies only to the special words used in that series.

Anyway, I can also live with the current solution, as I saw that a workaround is possible by deleting the standard dictionaries.

04-18-2014, 11:37 AM   #23
Divingduck
Wizard
Quote:
Originally Posted by kovidgoyal
@roger64:
<div lang="en">Some english text<span lang="fr">some french words</span></div>
Basically when the lang attribute exists on a tag, the contents of that tag are interpreted in the specified language. You can put a lang attribute on any tag, and they can be nested.

Interesting and good to know for future projects. Please put this hint as a side note in your help file too.

04-18-2014, 11:59 AM   #24
kovidgoyal
creator of calibre
Quote:
Originally Posted by Divingduck
Interesting and good to know for future projects. Please put this hint as a side note in your help file too.

It is in the manual; was it not clear?

Quote:
Words are shown with the number of times they occur in the book and the
language the word belongs to. Language information is taken from the book's
metadata and from ``lang`` attributes in the HTML files. This allows the spell
checker to work well even with books that contain text in multiple languages.

04-18-2014, 12:00 PM   #25
mrmikel
Color me gone
An example might be helpful to those of us who are not so knowledgeable. It is not such a common construction. Your previously cited example was perfect in that regard.

04-18-2014, 12:04 PM   #26
kovidgoyal
creator of calibre
I have added an example.

Regarding "show next occurrence" not working: I have found the bug that was causing it. Unfortunately, fixing it is not as simple as I had hoped; I need to think about it a little.

04-18-2014, 12:21 PM   #27
mrmikel
Color me gone
As it happens, if you press Ctrl+C while on the misspelled word, it is copied and can be pasted into the find field of the search panel and located that way, so it is not critical.

04-18-2014, 12:30 PM   #28
DrChiper
Bookish
Interestingly, sometimes the same assumed "typo" is only caught once, like in this example:
Code:
            <div class="images">
               <img src="images/drawing1.jpg" alt="Image drawing1.jpg: Illustration: Triangles (Fig. 1 - 3)" class="calibre21"/>
               <p class="image-caption">Fig. 1, 2 and 3.</p>
            </div>
in which the first occurrence of "drawing1" is caught, but the second is not (at least it is not counted in the list). It looks like the preceding character might have something to do with this behavior.

04-18-2014, 12:31 PM   #29
Divingduck
Wizard
Kovid,
Yes, your example was kind of an eye-opener for me. I had never seen this way of switching languages within a text before.

mrmikel,
Thanks for your explanations to Kovid. It was exactly what I thought.

04-18-2014, 12:42 PM   #30
kovidgoyal
creator of calibre
@mrmikel: Turns out, it was easy to fix after all: https://github.com/kovidgoyal/calibr...a6ef8453c57139

@DrChiper: calibre does not check the src attribute, since spelling is meaningless there. The title and alt attributes are checked, however.
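
The rule stated in this thread (check text inside tags other than script and style, plus the alt and title attributes, and nothing else) can be illustrated with a short Python sketch. It is not calibre's implementation; `CheckableText` and `checkable_text` are hypothetical names, and the sample is the snippet from post #28 with its indentation reduced.

```python
from html.parser import HTMLParser

SKIPPED_TAGS = {"script", "style"}     # their content is code, not prose
CHECKED_ATTRIBUTES = ("alt", "title")  # human-readable attributes


class CheckableText(HTMLParser):
    """Collect the strings a spell checker following the rule would look at."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than zero while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        for name, value in attrs:
            if name in CHECKED_ATTRIBUTES and value:
                self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def checkable_text(markup):
    """Return the text runs and alt/title values found in the markup."""
    parser = CheckableText()
    parser.feed(markup)
    parser.close()
    return parser.chunks


sample = '''<div class="images">
   <img src="images/drawing1.jpg" alt="Image drawing1.jpg: Illustration: Triangles (Fig. 1 - 3)" class="calibre21"/>
   <p class="image-caption">Fig. 1, 2 and 3.</p>
</div>'''

chunks = checkable_text(sample)
print(chunks)
print(sum(chunk.count("drawing1") for chunk in chunks))
```

Under this rule "drawing1" is seen only once, in the alt text; the occurrence inside src is never examined, which matches the count reported in post #28.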

All times are GMT -4.