Old 08-27-2023, 06:35 AM   #1
Dezemberschnee
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
KOReader dictionaries: Morphology/Inflections?

Dear Forum,

I am currently using AlReaderX on my Tolino device, paired with GoldenDict for English, French and Swedish (various dictionaries plus Hunspell files).

I have tried KOReader for its nicer dictionary interface. I converted my own StarDict dictionaries based on dict.cc data, but found the lack of inflection/morphology support to be a dealbreaker, especially for French.

I found conflicting answers online, so please excuse my asking: is there currently any way to support morphology/inflections in KOReader?

Thanks a lot!
Old 08-27-2023, 09:50 AM   #2
mergen3107
Wizard
Posts: 1,060
Karma: 3000026
Join Date: Feb 2012
Location: Cape Canaveral
Device: Kindle Scribe
I am afraid there is not. KOReader uses sdcv (the StarDict console version) as its dictionary engine, which does not support inflections, only fuzzy search.
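To make the distinction concrete, here is a rough Python sketch. It is not sdcv's actual algorithm: `difflib` from the standard library stands in for a fuzzy matcher, and the tiny headword list and inflection table are made up for illustration. Similarity matching can recover a regular form that still resembles its headword, but it has no way to connect an irregular form to its lemma, which is what an explicit inflection table provides.

```python
import difflib

# Hypothetical mini dictionary: headwords only, as in a plain index.
HEADWORDS = ["être", "avoir", "maison", "parler"]

# Hypothetical inflection table: inflected form -> headword.
INFLECTIONS = {"sommes": "être", "maisons": "maison", "parlons": "parler"}


def fuzzy_lookup(word, cutoff=0.8):
    """Return the closest headword by string similarity, or None."""
    matches = difflib.get_close_matches(word, HEADWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def inflection_lookup(word):
    """Return the headword for an exact or inflected form, or None."""
    if word in HEADWORDS:
        return word
    return INFLECTIONS.get(word)


if __name__ == "__main__":
    for form in ("maisons", "sommes"):
        print(form, "->", fuzzy_lookup(form), "|", inflection_lookup(form))
```

In this toy setup the plural "maisons" is close enough to "maison" for the similarity match to succeed, while "sommes" only resolves to "être" through the table.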
Old 08-27-2023, 09:56 AM   #3
kandwo
Addict
Posts: 356
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
It doesn't support morphology. The fuzzy search should be enough for English and Swedish.

I've managed to get Spanish to work quite well by using certain Wiktionary dictionaries for it. You could try and see if French is also supported well enough.

Try checking this out: https://github.com/BoboTiG/ebook-rea...s/fr/README.md (French-French Wiktionary)

Or for a French-English one: https://github.com/Vuizur/Wiktionary...tardict.tar.gz
Old 09-01-2023, 08:26 AM   #4
enigma90
Junior Member
Posts: 9
Karma: 10
Join Date: Feb 2019
Device: Kobo Glo HD
English Hunspell dictionaries work perfectly on KOReader. I got them from https://sourceforge.net/projects/wor...er/2020.12.07/
Just unzip the files and copy them to KOReader's dict directory. For other languages, go to https://github.com/wooorm/dictionari...n/dictionaries
Old 09-06-2023, 02:51 AM   #5
nezih
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
Quote:
Originally Posted by Dezemberschnee View Post
Dear Forum,

I am currently using AlReaderX on my Tolino device, paired with GoldenDict for English, French and Swedish (various dictionaries plus Hunspell files).

I have tried KOReader for its nicer dictionary interface. I converted my own StarDict dictionaries based on dict.cc data, but found the lack of inflection/morphology support to be a dealbreaker, especially for French.

I found conflicting answers online, so please excuse my asking: is there currently any way to support morphology/inflections in KOReader?

Thanks a lot!
You can add inflections to your existing dictionaries with this script: https://github.com/anezih/add_inflections
As an inflection source you can use the Hunspell data under the inflection_data folder, another dictionary, or both. If the language you need is not in that folder, I can add it. (Free Babylon dictionaries generally provide good inflection support, so you can use them to bolster the dictionaries you have.)
Old 09-13-2023, 09:34 AM   #6
Dezemberschnee
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
Quote:
Originally Posted by nezih View Post
You can add inflections to your existing dictionaries with this script: https://github.com/anezih/add_inflections
As an inflection source you can use the Hunspell data under the inflection_data folder, another dictionary, or both. If the language you need is not in that folder, I can add it. (Free Babylon dictionaries generally provide good inflection support, so you can use them to bolster the dictionaries you have.)
Thank you for the link, I really want to try that as well. I am very new to Python and cannot get it to work yet; maybe I will be able to figure it out in the future.

However, I had a look at inflection_data, and so far there seems to be no data for Swedish. Maybe that could be added?
Old 09-13-2023, 09:36 AM   #7
Dezemberschnee
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
Quote:
Originally Posted by enigma90 View Post
English Hunspell dictionaries work perfectly on KOReader. I got them from https://sourceforge.net/projects/wor...er/2020.12.07/
Just unzip the files and copy them to KOReader's dict directory. For other languages, go to https://github.com/wooorm/dictionari...n/dictionaries

Thanks for the reply. I have downloaded all available files (.aff, .ts, .dic, .js, .json) for all three languages and put them in separate folders within koreader/data/dict. The /dict folder already contains my existing dictionary files, also in separate folders. However, nothing changes when I open KOReader: under the dictionary settings ("manage dictionaries") there are no additional dictionaries to activate. I tried looking up a basic inflected word, which did not work, suggesting the Hunspell dictionary is not active. Am I missing something?
Old 09-13-2023, 01:46 PM   #8
nezih
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
Quote:
Originally Posted by Dezemberschnee View Post
Thank you for the link, I really want to try that as well. I am very new to Python and cannot get it to work yet; maybe I will be able to figure it out in the future.

However, I had a look at inflection_data, and so far there seems to be no data for Swedish. Maybe that could be added?
Added the language you requested.
Old 09-14-2023, 04:18 AM   #9
kandwo
Addict
Posts: 356
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
I tried running the script on a StarDict dictionary: http://libredict.org/dictionaries/ru...2023-09-07.tgz.

I unzipped the dictionary and ran the script on the directory. However, I got an error message:

"[Errno 2] No such file or directory: '(long path)/Wiktionary Russian-Russian.ifo'"

The file clearly exists and I've double checked that the path to it is correct. Is it the spaces that are messing something up or something else that I'm not understanding?
Old 09-14-2023, 07:12 AM   #10
nezih
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
Quote:
Originally Posted by kandwo View Post
I tried running the script on a StarDict dictionary: http://libredict.org/dictionaries/ru...2023-09-07.tgz.

I unzipped the dictionary and ran the script on the directory. However, I got an error message:

"[Errno 2] No such file or directory: '(long path)/Wiktionary Russian-Russian.ifo'"

The file clearly exists and I've double checked that the path to it is correct. Is it the spaces that are messing something up or something else that I'm not understanding?
Code:
python .\add_inflections.py --dict-file '.\dicts\Wiktionary Russian-Russian\Wiktionary Russian-Russian.ifo' -j .\inflection_data\Russian.json.gz
I was able to add inflections to the dictionary with the command above. Note that --dict-file points at the .ifo file itself, not at the folder. I also put the files closer to the script; you can try that.
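For anyone who trips over the same path problem, a small helper along these lines could resolve either a dictionary folder or a direct file path to the .ifo file before it is handed to the script. This is only a sketch and not part of add_inflections; the name `find_ifo` is made up here.

```python
import sys
from pathlib import Path


def find_ifo(path):
    """Return the .ifo file for `path`, which may be the file or its folder."""
    p = Path(path)
    if p.is_file() and p.suffix == ".ifo":
        return p
    if p.is_dir():
        candidates = sorted(p.glob("*.ifo"))
        if len(candidates) == 1:
            return candidates[0]
        raise FileNotFoundError(
            f"expected exactly one .ifo in {p}, found {len(candidates)}"
        )
    raise FileNotFoundError(f"no .ifo file or dictionary folder at: {p}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the resolved path so it can be pasted after --dict-file.
    print(find_ifo(sys.argv[1]))
```

Because the path is handled as a single `Path` object, spaces in names such as "Wiktionary Russian-Russian" are not an issue on the Python side; on the command line the path still needs to be quoted, as in the command above.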

With the unmunched data in the inflection_data folder, 1,940,291 synonym entries (synwordcount) were added to the dictionary. Here is the .ifo file of the output:
Spoiler:
StarDict's dict ifo file
version=3.0.0
bookname=Wiktionary Russian-Russian
wordcount=430770
idxfilesize=12875217
synwordcount=1940291
description=
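If you want to check these numbers on your own output, the .ifo file is plain text: a fixed first line followed by key=value pairs. A minimal Python sketch for reading it (the function name `parse_ifo` is made up here, and the sample text is simply the file shown above):

```python
def parse_ifo(text):
    """Parse the text of a StarDict .ifo file into a dict of strings."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("StarDict's dict ifo file"):
        raise ValueError("not a StarDict .ifo file")
    info = {}
    for line in lines[1:]:
        if "=" in line:
            key, _, value = line.partition("=")
            info[key.strip()] = value.strip()
    return info


SAMPLE = """StarDict's dict ifo file
version=3.0.0
bookname=Wiktionary Russian-Russian
wordcount=430770
idxfilesize=12875217
synwordcount=1940291
description=
"""

if __name__ == "__main__":
    info = parse_ifo(SAMPLE)
    # Rough measure of how much inflection coverage was added per headword.
    ratio = int(info["synwordcount"]) / int(info["wordcount"])
    print(f"{info['bookname']}: {ratio:.1f} synonym entries per headword")
```

A synwordcount of zero, or a missing synwordcount line, would suggest that no inflections made it into the output.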
Old 09-14-2023, 10:28 AM   #11
kandwo
Addict
Posts: 356
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
Quote:
Originally Posted by nezih View Post
Code:
python .\add_inflections.py --dict-file '.\dicts\Wiktionary Russian-Russian\Wiktionary Russian-Russian.ifo' -j .\inflection_data\Russian.json.gz
I was able to add inflections to the dictionary with the command above. Note that --dict-file points at the .ifo file itself, not at the folder. I also put the files closer to the script; you can try that.

With the unmunched data in the inflection_data folder, 1,940,291 synonym entries (synwordcount) were added to the dictionary. Here is the .ifo file of the output:
Spoiler:
StarDict's dict ifo file
version=3.0.0
bookname=Wiktionary Russian-Russian
wordcount=430770
idxfilesize=12875217
synwordcount=1940291
description=
I didn't realize I had to target the .ifo file specifically. Once I did that, it all worked: the script created a new folder containing the old files plus a .syn file.
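For the curious, the .syn file is what carries the inflections. As far as I understand the StarDict format description, each entry is a NUL-terminated UTF-8 string (the inflected form) followed by a 32-bit big-endian number, which is the position of the target headword in the .idx file. A small Python sketch for reading it (`read_syn` and `build_syn` are names made up for this example; `build_syn` exists only to produce test data):

```python
import struct


def read_syn(data):
    """Yield (synonym, headword_index) pairs from the raw bytes of a .syn file."""
    pos = 0
    while pos < len(data):
        end = data.index(b"\x00", pos)  # the synonym is NUL-terminated UTF-8
        word = data[pos:end].decode("utf-8")
        (idx,) = struct.unpack_from(">I", data, end + 1)  # 32-bit big-endian
        yield word, idx
        pos = end + 1 + 4


def build_syn(pairs):
    """Inverse of read_syn; used here only to create sample data."""
    return b"".join(
        word.encode("utf-8") + b"\x00" + struct.pack(">I", idx)
        for word, idx in pairs
    )


if __name__ == "__main__":
    sample = build_syn([("maisons", 3), ("sommes", 0)])
    for word, idx in read_syn(sample):
        print(word, "-> headword number", idx)
```

To inspect a real dictionary, read the file with `open(path, "rb").read()` and pass the bytes to `read_syn`.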

I've tested it briefly and it seems to work rather well in most cases. The dictionary itself isn't the best due to bad formatting and lack of word stress.

I noticed that the Russian Wiktionary that can be downloaded from within KOReader itself seems better, with an even bigger .syn file (almost twice as big). However, some words just aren't found, for whatever reason. Since that dictionary already comes with a .syn file, I suppose it would be superfluous to run this script on it too?

I'll have to experiment further when I have the time.
Old 09-14-2023, 11:00 AM   #12
nezih
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
Quote:
Originally Posted by kandwo View Post
I didn't realize I had to target the .ifo file specifically. Once I did that, it all worked: the script created a new folder containing the old files plus a .syn file.

I've tested it briefly and it seems to work rather well in most cases. The dictionary itself isn't the best due to bad formatting and lack of word stress.

I noticed that the Russian Wiktionary that can be downloaded from within KOReader itself seems better, with an even bigger .syn file (almost twice as big). However, some words just aren't found, for whatever reason. Since that dictionary already comes with a .syn file, I suppose it would be superfluous to run this script on it too?

I'll have to experiment further when I have the time.
Well, you can keep the existing inflections with the --keep flag and inject more from other sources at the same time. Also, as I mentioned above, free Babylon [lang]-English dictionaries provide good inflection support, and you can pass them to the script as an inflection source too (or any other supported dictionary; for example, you can transfer inflections from the KOReader one to this one).
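Conceptually, combining sources is just a union of inflection tables. The Python sketch below illustrates the idea only; it is not how add_inflections works internally, and the {headword: [forms]} layout is an assumption made for the example (the script's real data files may be structured differently).

```python
def merge_inflections(*sources):
    """Union several {headword: [forms]} mappings into one mapping."""
    merged = {}
    for source in sources:
        for headword, forms in source.items():
            merged.setdefault(headword, set()).update(forms)
    # Drop forms identical to the headword; they add nothing to a lookup.
    return {hw: sorted(forms - {hw}) for hw, forms in merged.items()}


if __name__ == "__main__":
    # Hypothetical data from two different inflection sources.
    from_hunspell = {"parler": ["parle", "parlons"]}
    from_other_dict = {
        "parler": ["parlons", "parlez", "parler"],
        "maison": ["maisons"],
    }
    for headword, forms in merge_inflections(from_hunspell, from_other_dict).items():
        print(headword, forms)
```

Each extra source can only add forms, never remove them, which is why stacking several sources tends to raise the number of inflections found.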
Old 09-16-2023, 11:07 PM   #13
enigma90
Junior Member
Posts: 9
Karma: 10
Join Date: Feb 2019
Device: Kobo Glo HD
Quote:
Originally Posted by Dezemberschnee View Post
Thanks for the reply. I have downloaded all available files (.aff, .ts, .dic, .js, .json) for all three languages and put them in separate folders within koreader/data/dict. The /dict folder already contains my existing dictionary files, also in separate folders. However, nothing changes when I open KOReader: under the dictionary settings ("manage dictionaries") there are no additional dictionaries to activate. I tried looking up a basic inflected word, which did not work, suggesting the Hunspell dictionary is not active. Am I missing something?
You only need the .aff and .dic files to make it work. You can use different file names for different languages, so there's no need to put the files in separate folders. Just put them under the dict folder. And don't forget to turn on "enable fuzzy search" in KOReader.
Old 09-18-2023, 02:42 AM   #14
nezih
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
Quote:
Originally Posted by enigma90 View Post
You only need the .aff and .dic files to make it work. You can use different file names for different languages, so there's no need to put the files in separate folders. Just put them under the dict folder. And don't forget to turn on "enable fuzzy search" in KOReader.
AFAIK, fuzzy search has nothing to do with Hunspell files and sdcv has no support for Hunspell morphology.
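To show why the raw files do nothing by themselves: a .dic file lists stems with flags, and the .aff file holds the rules those flags refer to, so something has to expand ("unmunch") them into actual word forms before a lookup can use them. Below is a heavily simplified Python sketch of that expansion. It handles suffix (SFX) rules with single-character flags only, the rules and words are made up, and real Hunspell files have many more features.

```python
import re


def parse_sfx_rules(aff_text):
    """Collect simplified suffix rules: flag -> [(strip, add, condition)]."""
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        # Rule lines have 5+ fields: SFX flag strip add condition.
        # Header lines (SFX flag Y count) have only 4 and are skipped.
        if len(parts) >= 5 and parts[0] == "SFX":
            _, flag, strip, add, cond = parts[:5]
            strip = "" if strip == "0" else strip
            add = "" if add == "0" else add
            rules.setdefault(flag, []).append((strip, add, cond))
    return rules


def unmunch(dic_text, rules):
    """Expand 'word/FLAGS' entries into {word: [inflected forms]}."""
    forms = {}
    for line in dic_text.splitlines()[1:]:  # first line is the entry count
        if not line.strip():
            continue
        entry = line.split()[0]  # ignore any extra fields after the entry
        word, _, flags = entry.partition("/")
        out = []
        for flag in flags:  # assumes single-character flags
            for strip, add, cond in rules.get(flag, []):
                if re.search(f"(?:{cond})$", word) and word.endswith(strip):
                    out.append(word[: len(word) - len(strip)] + add)
        forms[word] = out
    return forms


# Made-up miniature affix and dictionary files.
AFF = """SFX S Y 2
SFX S y ies [^aeiou]y
SFX S 0 s [^y]
"""
DIC = """2
city/S
house/S
"""

if __name__ == "__main__":
    print(unmunch(DIC, parse_sfx_rules(AFF)))
```

The expanded forms are what end up as synonym entries pointing back at the headword; without that step the engine only ever sees the stems.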
Old 03-30-2024, 01:31 AM   #15
sricochet
Member
Posts: 19
Karma: 10
Join Date: Sep 2014
Device: Kindle Scribe
When I convert my tab file to a dictionary, I get the following output:

Preparing the inflection sources... Done.
Reading the input dictionary... Done.
> Processed 76,280 / ? words. Total inflections found: 78
Writing the output file(s)... Done.

I am not sure what "? words" means, but it says that only 78 inflections were found.

Edit: I was able to make some progress by combining unmunched JSON inflection files from different dictionaries. That gave up to 304 inflections on one dictionary and 869 on the other, but I am still getting the question mark. Is it possible that the script is having difficulty with the bilingual aspect of the dictionary?

Edit: Using the method suggested above:

Quote:
Well, you can keep the existing inflections with the --keep flag and inject more from other sources at the same time. Also, as I mentioned above, free Babylon [lang]-English dictionaries provide good inflection support, and you can pass them to the script as an inflection source too (or any other supported dictionary).
works quite well when making a tab file out of various .dic files and Wiktionary dumps.

I must admit this is quite a useful script. Very much appreciated. Thank you!

Last edited by sricochet; 04-01-2024 at 01:37 AM.