11-16-2012, 01:00 PM | #1 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
A suggestion to the Kobo developers
Edit: This post is outdated, cf. this post.
I tested the Japanese dictionary of the KT for some time. I am sorry to say that the dictionary function is of almost no use. I am not saying that the dictionary itself is too small or of poor quality; that may or may not be the case. The real problem is that the search engine cannot reliably retrieve the proper data from the dictionary. The reason for this is easy to see.
First, the search engine retrieves from a database one of the (usually several) possible phonetic representations (= kana) of the kanji being searched for. Because there are often several possible readings, selecting one without any further effective check is rather arbitrary. This is the first weak point.
Next, the phonetic representation together with the kanji is looked up in an index file ("words") to ascertain whether the dictionary has an entry that corresponds to the word or expression being searched for. If it does, the search engine turns to a certain html file to retrieve the dictionary entry. The name of the html file is determined by the first two letters of the phonetic representation. Whether this is the appropriate html file therefore depends on whether the kanji-kana matching happened to fit the context.
Even if the appropriate html file is accessed, there is no guarantee that the search engine will come up with the correct result, because at this stage the engine has already "forgotten" which kanji the user was looking for. It therefore happily presents the first entry that matches the phonetic representation as the result. The extremely large number of homophones in Japanese speaks clearly against this approach.
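The failure mode described above can be illustrated with a toy model. This is only a sketch of the behaviour as described in this post, not Kobo's actual code; the tables, the function name and the three sample entries are all invented for illustration.

```python
# Toy model of the reading-first lookup described above. Not Kobo's code:
# all tables, names and entries below are invented for illustration.

# Step 1 data: kanji -> possible readings (kana).
READINGS = {"橋": ["はし"], "箸": ["はし"], "端": ["はし", "はた"]}

# Step 2 data: the "words" index of (reading, kanji) pairs that have an entry.
WORDS = {("はし", "橋"), ("はし", "箸"), ("はし", "端")}

# Step 3 data: html files keyed by the first two kana of the reading.
# Each file holds (reading, headword, gloss) tuples in file order.
HTML_FILES = {
    "はし": [
        ("はし", "橋", "bridge"),
        ("はし", "箸", "chopsticks"),
        ("はし", "端", "edge"),
    ],
}


def reading_first_lookup(kanji):
    """Look up a kanji the way the post describes the engine doing it."""
    reading = READINGS[kanji][0]            # one reading, picked without context
    if (reading, kanji) not in WORDS:       # index check still knows the kanji
        return None
    for kana, _headword, gloss in HTML_FILES.get(reading[:2], []):
        if kana == reading:                 # the kanji is no longer compared
            return gloss                    # first homophone wins
    return None
```

With this sample data, looking up 箸 (chopsticks) returns the gloss for 橋 (bridge), because both share the reading はし and the bridge entry comes first in the file.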
My suggestion would therefore be to take the kanji (or kanji+kana expressions, as the case may be) as the primary means of organizing the material. The only necessary modification to the general procedure (followed by the KT with the other languages) would be that the name of each html file would consist of only one character (= kanji). This would result in approximately 4749 html files in the case of a decent Japanese dictionary (the present Japanese dictionary consists of ca. 4100 files). My calculation is based on a version of the free edict dictionary file that contains 205 721 entries (cf. http://www.csse.monash.edu.au/~jwb/edict.html). To take care of words that are commonly written in kana rather than in kanji, and of words starting with numeric characters, some further html files would be needed (232 in the case of my edict version). I think the improvement would be tremendous and the effort worthwhile. I would be happy to answer any questions you may have. Last edited by tshering; 12-20-2012 at 02:52 PM. |
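A minimal sketch of how such a file count could be estimated from a list of headwords. The function names, the ".html" naming and the sample words are assumptions made for illustration; the kanji test only covers the main CJK Unified Ideographs block (U+4E00 to U+9FFF), and headwords are assumed to be non-empty.

```python
from collections import defaultdict


def is_kanji(ch):
    # Main CJK Unified Ideographs block only; rarer extension blocks are ignored.
    return "\u4e00" <= ch <= "\u9fff"


def bucket_by_first_char(headwords):
    """Group headwords into one hypothetical html file per first character."""
    buckets = defaultdict(list)
    for word in headwords:
        buckets[word[0] + ".html"].append(word)
    return dict(buckets)


def count_files(buckets):
    """Return (files named after a kanji, files for kana, digits, etc.)."""
    kanji_files = sum(1 for name in buckets if is_kanji(name[0]))
    return kanji_files, len(buckets) - kanji_files


# Invented sample: two kanji buckets, plus hiragana, katakana and a digit.
sample = ["橋", "橋渡し", "箸", "はし", "ハシ", "1番"]
buckets = bucket_by_first_char(sample)
kanji_count, other_count = count_files(buckets)
```

Run over the full set of edict headwords, the two counts would correspond to the kanji files and the additional kana/numeric files mentioned above.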
11-20-2012, 10:57 PM | #2 | |
Tenrec
Posts: 724
Karma: 1076988
Join Date: Oct 2012
Device: Kobo Aura One, Kobo Glo
|
sort of related...
Are you currently able to go into the dictionary function, type in a word in Japanese and look it up? Or only by highlighting one or more kanji/kana? I used to be able to type words in, but after resetting because of a different problem, I could no longer do this in Japanese (and weirdly, not in French either). Just curious if there are others out there who can... and you seem to be the lonely one out there who cares about the Japanese dictionary function...
|
|
11-21-2012, 04:15 AM | #3 | |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
However, I made a small sample Japanese-English dictionary (the first 50000 entries or so of Jim Breen's Edict) using the pattern of the Kobo English dictionary and so on, rather than the pattern of the Kobo Japanese dictionary. After installing it disguised as the German dictionary, it works as expected and I can access the dictionary function too. In order to type Hiragana/Katakana, however, one has to change the user language to Japanese. If you would like to search the French dictionary, just rename it to dicthtml-de.zip, or to the name of another dictionary language that you do not use (and that allows searching). |
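The renaming trick can be scripted if preferred. This is only a sketch: the function name and the paths passed in are hypothetical, and only the target name pattern (dicthtml-de.zip) comes from this thread. The copy keeps the original file in place, and any dictionary already stored under the target name is set aside first.

```python
import shutil
from pathlib import Path


def install_disguised(source_zip, dict_dir, disguise_lang="de"):
    """Copy a dictionary zip into dict_dir under another language's name.

    source_zip and dict_dir are supplied by the caller; nothing about the
    device's folder layout is assumed here.
    """
    target = Path(dict_dir) / f"dicthtml-{disguise_lang}.zip"
    if target.exists():
        # Keep the dictionary that is being displaced as a backup copy.
        shutil.move(str(target), str(target) + ".bak")
    shutil.copyfile(source_zip, target)
    return target
```

For example, calling it with the French dictionary file as source_zip and the reader's dictionary folder as dict_dir would produce a dicthtml-de.zip there.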
|
11-21-2012, 10:49 AM | #4 |
whippet addict
Posts: 382
Karma: 689884
Join Date: Dec 2011
Location: France, Normandy, Gisors
Device: Kobo Glo, Kobo Aura 6", Kobo Glo HD, Kobo Aura One, Kobo Sage
|
I'm not fluent enough to read in Japanese, but I noticed something funny... I mostly read books in English, but sometimes find interesting books in French too (I set my main language to French on the device, as I'm French). Lately, I encountered a word in a French book that I wanted to look up. I selected it and it was the English-English dictionary that popped up, with no way to access the French-French dictionary. Another thing: no proper names show up in the dictionary (that could be useful to place a town or a historical character).
|
11-21-2012, 12:49 PM | #5 | ||
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
|
||
11-21-2012, 03:48 PM | #6 | |||
Tenrec
Posts: 724
Karma: 1076988
Join Date: Oct 2012
Device: Kobo Aura One, Kobo Glo
|
I'm hoping they fix these dictionaries soon. I can be patient for now anyway, since I don't currently have any Japanese books I want to read (I just downloaded some random free ones to experiment with)... I'm just keeping my eye on the situation to see how things unfold.
|
|||
11-21-2012, 04:18 PM | #7 |
whippet addict
Posts: 382
Karma: 689884
Join Date: Dec 2011
Location: France, Normandy, Gisors
Device: Kobo Glo, Kobo Aura 6", Kobo Glo HD, Kobo Aura One, Kobo Sage
|
To be honest, I don't want to bother myself with compiling a dictionary. I'd rather note the words I want to look up on paper (or I could at least highlight them) and then search the web for them when I'm in front of my laptop. As for the book I'm currently reading in French, I just checked the metadata in calibre (I convert all my books from epub to epub in calibre): the language is set to French... so I don't know why it's the English-English dictionary that pops up...
|
12-20-2012, 02:47 PM | #8 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
I am happy to see that the Japanese definition dictionary (dated 21.11.2012) has been modified as I suggested in the first post of this thread. I do not know whether this was in response to this thread or whether it is a coincidence. Either way, I am thankful for the change and would like to say thank you to all decision makers and developers involved.
|
01-06-2013, 02:11 PM | #9 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
Today, my Touch updated to 2.3.1, so I now have first-hand experience of the new dictionary engine for Japanese. It is still not possible to edit the searched-for term. As a result, if you point at an inflected word, chances are high that you will not get the dictionary entry you wanted, and there is no way to change the ending to the dictionary form. I therefore still prefer the custom Japanese-English dictionary provided here.
The same bug is also still present in the French definition dictionary. Last edited by tshering; 01-07-2013 at 04:22 AM. |
01-06-2013, 09:14 PM | #10 |
No Comment
Posts: 3,238
Karma: 23878043
Join Date: Jan 2012
Location: Australia
Device: Kobo: Not just an eReader, it's an adventure!
|
|
01-07-2013, 04:21 AM | #11 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
|
02-01-2018, 12:26 PM | #12 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
I would like to mention that the performance of the English definitions dictionary (and presumably also of other dictionaries) could easily be improved by making better use of the variants that are recorded in the dictionary. Currently, variants seem not to be taken into consideration when they are available only in abbreviated form, like "-aries" in the case of "boundary". To check this, I went through the dictionary and resolved the abbreviated variants. I add some images for illustration.
I resolved a little more than 7500 abbreviated variants. The advantage, however, is less than this number might suggest, since a lot of cases are covered by a kind of automatic matching, which seems successful in cases like radiate : radiated. Last edited by tshering; 02-01-2018 at 12:30 PM. |
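One way such abbreviated variants could be expanded is sketched below. This is an assumed heuristic for illustration, not necessarily the procedure used for the 7500 variants above: the headword is cut at the last occurrence of the first letter of the abbreviated ending, and the ending is attached there. The function name is hypothetical.

```python
def resolve_variant(headword, variant):
    """Expand an abbreviated variant such as "-aries" against its headword.

    Heuristic (an assumption): cut the headword at the last occurrence of
    the ending's first letter, then append the ending. Returns None when
    the variant cannot be resolved this way.
    """
    if not variant.startswith("-"):
        return variant                  # already written out in full
    ending = variant[1:]
    if not ending:
        return None
    cut = headword.rfind(ending[0])
    if cut <= 0:                        # letter missing, or nothing would remain
        return None
    return headword[:cut] + ending
```

For "boundary" and "-aries" this cuts at the last "a" and yields "boundaries". Endings that do not overlap with the headword are left unresolved rather than guessed.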
Tags |
improvement, japanese dictionary |
|