Old 10-23-2020, 01:37 PM   #1
mzel
Enthusiast
Posts: 42
Karma: 10
Join Date: Apr 2016
Device: Kobo Forma
Question about dictionaries

Is support for morphology (genders, conjugations, contractions, etc.) a function of the dictionary or of the KOReader application?

It is not much of a problem for English, but it is for many other languages.
Old 10-29-2020, 06:09 PM   #2
mergen3107
Margins dominator
Posts: 229
Karma: 2332084
Join Date: Feb 2012
Location: Cape Canaveral
Device: PW3 (5.9.7 JB)
AFAIK, StarDict (the engine KOReader uses) does not support morphology. However, it has fuzzy search, which looks for the most similar-looking word if no exact match is found. Sometimes it works, sometimes it doesn't. In the latter case I just tap and hold the word title in the KOReader dictionary popup and type the word I need manually.
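
To illustrate the general idea of "closest match if there is no exact match", here is a minimal Python sketch. It is not sdcv's actual algorithm; it uses the standard library's difflib similarity ratio as a stand-in, and the headword list and the closest_headword name are made up for the example.

```python
import difflib

def closest_headword(word, headwords, cutoff=0.6):
    """Return the headword most similar to `word`, or None if nothing is close enough.

    Illustration only: difflib's ratio stands in for whatever
    similarity measure the real dictionary engine uses.
    """
    if word in headwords:
        return word  # exact match wins
    matches = difflib.get_close_matches(word, headwords, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical headword list; the feminine form is not a headword itself.
headwords = ["abbacchiato", "abbaiare", "casa"]
print(closest_headword("abbacchiata", headwords))
```

This kind of lookup works when the inflected form differs from the headword by a letter or two, and fails for irregular forms, which matches the "sometimes it works, sometimes it doesn't" experience.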
Old 10-30-2020, 04:25 PM   #3
Doitsu
Grand Sorcerer
Posts: 5,059
Karma: 17116635
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mergen3107 View Post
AFAIK, StarDict (the engine KOReader uses) does not support morphology.
StarDict does support inflections. The KOReader StarDict engine does not.

EDIT: I was wrong, KOReader has been supporting .syn files since late 2019.

Last edited by Doitsu; 11-02-2020 at 12:41 PM.
Old 11-01-2020, 02:09 PM   #4
mzel
Enthusiast
Posts: 42
Karma: 10
Join Date: Apr 2016
Device: Kobo Forma
I guess you are talking about the .syn (synonyms) file. That is not really inflection support: it requires you to list every possible form of each word, as opposed to a list of rules for the language.
That means that if an Italian verb has ~100 forms, you need to provide 100 forms for each verb, as opposed to 100 rules for all the regular verbs combined.
Old 11-01-2020, 02:37 PM   #5
Doitsu
Grand Sorcerer
Posts: 5,059
Karma: 17116635
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mzel View Post
I guess you are talking about the .syn (synonyms) file.
I was indeed referring to .syn files, which the KOReader StarDict engine doesn't support.

Quote:
Originally Posted by mzel View Post
That means that if an Italian verb has ~100 forms, you need to provide 100 forms for each verb, as opposed to 100 rules for all the regular verbs combined.
AFAIK, there are no cross-platform Open Source dictionary engines that support defining POS-based morphology rules.
Having to define all forms for each entry may seem like a rather primitive method, but it works surprisingly well.

BTW, if you want to add inflections to your own StarDict dictionary, you might find Tvangeste's inflection word lists for English, French, Italian, German, Spanish, Portuguese, Polish and Russian helpful.

Last edited by Doitsu; 11-02-2020 at 12:41 PM.
Old 11-01-2020, 07:36 PM   #6
NiLuJe
BLAM!
Posts: 11,028
Karma: 19000122
Join Date: Jun 2010
Location: Paris, France
Device: Kindle 2i, 3g, 4, 5w, PW & PW2; Kobo H2O & Forma
Doesn't it? I recall a host of issues about sdcv being *slow* when dealing with synonyms, but handling them nonetheless.
Old 11-02-2020, 10:02 AM   #7
Galunid
Connoisseur
Posts: 69
Karma: 29024
Join Date: Apr 2016
Device: KPW3, Kobo Clara HD, Onyx Boox Nova 2
Yup, it should support it, at least according to the issue @NiLuJe mentioned: https://github.com/koreader/koreader/issues/5437
Old 11-02-2020, 12:40 PM   #8
Doitsu
Grand Sorcerer
Posts: 5,059
Karma: 17116635
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by NiLuJe View Post
Doesn't it? I recall a host of issues about sdcv being *slow* when dealing with synonyms, but handling them nonetheless.
You are of course right. Apparently, KOReader has been supporting .syn files since 2019.

(I updated my initial post.)
Old 11-02-2020, 11:44 PM   #9
mzel
Enthusiast
Posts: 42
Karma: 10
Join Date: Apr 2016
Device: Kobo Forma
Reading up on this forum and 3-4 others, I came to the conclusion that my options are:
1) Generating a .syn file from the grammar rules at the link above or from the .aff file of a GoldenDict dictionary
2) Trying to build a command-line GoldenDict for Kobo and writing a plugin for it in KOReader
3) Implementing those same rules in a .lua wrapper around sdcv and trying to find the closest match from KOReader
4) Finding a ready-made .syn file for the language - Italian in this case
5) Something else? Kindle was able to do a better job in this department. I mean the native Kindle reader with dictionaries built for it. The Italian-English dictionary was pretty good in this regard. The Italian-Russian one was not perfect, but still better than what we have now under KOReader. It uses the same initial vocabulary but handles inflections far better. I never tried installing those dictionaries under KOReader on a Kindle.
6) Forgoing all of the above and using manual entry plus guesswork to arrive at the correct headword
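
Option 1 (expanding rules into an explicit list of forms) could be sketched roughly as below. This is only a toy illustration in Python: the rule table covers just the present tense of regular Italian -are verbs, the function names are made up, and real rules (e.g. from an .aff file) are considerably more involved. The output uses the pipe-separated Babylon GLS headword layout.

```python
# Hypothetical rule table: present-tense endings for regular Italian -are verbs.
ARE_PRESENT = ["o", "i", "a", "iamo", "ate", "ano"]

def expand(headword, ending, suffixes):
    """Apply one suffix rule set: strip `ending`, then append each suffix."""
    if not headword.endswith(ending):
        return []  # the rule does not apply to this headword
    stem = headword[: -len(ending)]
    return [stem + suffix for suffix in suffixes]

def gls_headword_line(headword, forms):
    """Join a headword and its distinct inflected forms with '|'."""
    unique = []
    for form in forms:
        if form != headword and form not in unique:
            unique.append(form)
    return "|".join([headword] + unique)

print(gls_headword_line("parlare", expand("parlare", "are", ARE_PRESENT)))
```

A naive rule like this gets spelling details wrong for some verbs (mangiare would give "mangii" instead of "mangi"), so a generated list would still need checking against real inflection data.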

All suggestions and comments are welcome

Last edited by mzel; 11-03-2020 at 12:11 PM.
Old 11-03-2020, 03:17 AM   #10
Doitsu
Grand Sorcerer
Posts: 5,059
Karma: 17116635
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mzel View Post
4) finding a ready-made .syn file for the language - Italian in this case
AFAIK, .syn files are dictionary-specific index files that can't be re-used. They're automatically generated by StarDict Editor when you compile a Babylon GLS source file.
Quote:
Originally Posted by mzel View Post
5) Something else? Kindle was able to do a better job in this department.
You could unpack one of the free bilingual Oxford dictionaries that Amazon offers as optional downloads for eInk Kindle users with KindleUnpack and extract the inflection data.

Here's an example entry from the Italian-English Oxford dictionary:

Code:
<idx:orth value="abbacchiato">
    <idx:infl>
        <idx:iform name="" value="abbacchiata"/>
        <idx:iform name="" value="abbacchiate"/>
        <idx:iform name="" value="abbacchiati"/>
    </idx:infl>
</idx:orth>
The Babylon GLS equivalent is:

Code:
abbacchiato|abbacchiata|abbacchiate|abbacchiati
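
A minimal Python sketch of that conversion, assuming the unpacked source contains idx:orth blocks shaped exactly like the example above (the function name and the regexes are mine, not part of KindleUnpack). It only produces the headword line; the definition text would have to be extracted separately.

```python
import re

# Matches an <idx:orth value="..."> ... </idx:orth> block and captures
# the headword and everything between the tags.
ORTH_RE = re.compile(r'<idx:orth\s+value="([^"]*)"\s*>(.*?)</idx:orth>', re.DOTALL)
# Captures the value attribute of each <idx:iform .../> element.
IFORM_RE = re.compile(r'<idx:iform\b[^>]*\bvalue="([^"]*)"')

def orth_to_gls_headwords(markup):
    """Return one 'headword|form1|form2...' line per idx:orth block."""
    lines = []
    for headword, body in ORTH_RE.findall(markup):
        forms = []
        for form in IFORM_RE.findall(body):
            if form != headword and form not in forms:
                forms.append(form)
        lines.append("|".join([headword] + forms))
    return lines

sample = '''<idx:orth value="abbacchiato">
    <idx:infl>
        <idx:iform name="" value="abbacchiata"/>
        <idx:iform name="" value="abbacchiate"/>
        <idx:iform name="" value="abbacchiati"/>
    </idx:infl>
</idx:orth>'''

print(orth_to_gls_headwords(sample))
```

Regexes are used instead of an XML parser because the idx: prefix is normally not declared as a namespace in the unpacked file, which a strict parser would reject.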
Old 11-03-2020, 11:52 AM   #11
mzel
Enthusiast
Posts: 42
Karma: 10
Join Date: Apr 2016
Device: Kobo Forma
re 4) IMHO the only dictionary-specific part of the .syn is the set of base words. Otherwise it should be language-specific. As far as I understand, .syn is part of the input, not the output, of the StarDict converter.
re 5) That again would only be an intermediate step towards creating a .syn for StarDict. BTW, do you have a link to those dictionaries?
Old 11-03-2020, 01:12 PM   #12
Doitsu
Grand Sorcerer
Posts: 5,059
Karma: 17116635
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mzel View Post
As far as I understand .syn is part of the input, not output for the Stardict converter.
No, the opposite is true. StarDict Editor will automatically generate a .syn file if the Babylon GLS source file contains inflections.
This also means that the .syn files are not interchangeable.
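
To see why, it helps to look at the layout of a .syn file. As far as I understand the StarDict format description, each record is a NUL-terminated UTF-8 string followed by a 32-bit big-endian number giving the position of the target headword in that dictionary's .idx file. The Python sketch below builds and parses such records under that assumption; the function names are made up, and it ignores the sort order a real file needs.

```python
import struct

def build_syn(pairs):
    """Serialize (synonym, idx_position) pairs into .syn-style bytes."""
    out = bytearray()
    for word, idx_position in pairs:
        out += word.encode("utf-8") + b"\x00" + struct.pack(">I", idx_position)
    return bytes(out)

def parse_syn(data):
    """Parse .syn-style bytes back into (synonym, idx_position) pairs."""
    entries = []
    pos = 0
    while pos < len(data):
        end = data.index(b"\x00", pos)        # end of the synonym string
        word = data[pos:end].decode("utf-8")
        (idx_position,) = struct.unpack_from(">I", data, end + 1)
        entries.append((word, idx_position))
        pos = end + 1 + 4                      # skip the NUL and the 4-byte number
    return entries

# Hypothetical data: all three forms point at entry 0 of some .idx file.
pairs = [("abbacchiata", 0), ("abbacchiate", 0), ("abbacchiati", 0)]
print(parse_syn(build_syn(pairs)))
```

Because the number is a position in one particular .idx, the same .syn dropped next to a different dictionary would point its synonyms at unrelated headwords.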

You might want to compile a simple Babylon GLS source file yourself with StarDict Editor. (I attached a small test file to this post; the input file is fr_en_sample.babylon)

Quote:
Originally Posted by mzel View Post
BTW do you have the link to those dictionaries?
No, but if you own a registered eInk Kindle, you can download them for free.
Old 11-03-2020, 02:27 PM   #13
pazos
cosiñeiro
Posts: 671
Karma: 815871
Join Date: Apr 2014
Device: BQ Cervantes 4
Quote:
Originally Posted by mzel View Post
2) Trying to build a command-line GoldenDict for Kobo and writing a plugin for it in KOReader
3) Implementing those same rules in a .lua wrapper around sdcv and trying to find the closest match from KOReader
Contributions are always welcome.

Keep in mind a few things:

- GoldenDict is not a dict format. It is an app that supports plenty of dict formats. You'll need to choose which of the formats GoldenDict supports you want to handle, peek into the GoldenDict code and get something that works for that specific format.

- There's no need to build a commandline tool. A lua C module would be fine too.

- Lua is too slow for a dict app. The code needs to be written in C/C++. The interface between your code and KOReader can be coded as you want.

From the supported colordict formats slob and zim look the most interesting to me, but I don't know if they have the features you need.

The end goal is to have a program/library that works on all KOReader devices, which are mostly Linux ARM devices. Support for 3rd party apps is already available on devices that are intended to run apps (Android, Linux, Mac).