Old 11-16-2012, 01:00 PM   #1
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
A suggestion to the Kobo developers

Edit: This post is outdated, cf. this post.
I have tested the Japanese dictionary of the KT for some time. I am sorry to say that I found the dictionary function to be of almost no use. I am not saying that the dictionary itself is too small or of poor quality; that may or may not be the case. The real problem is that the search engine cannot reliably retrieve the proper entry from the dictionary. The reason is easy to see. The first thing the search engine does is retrieve from a database one of the (usually several) possible phonetic representations (= kana) of the kanji being looked up. Because in many cases there are several possible readings, selecting one of them without any further effective check is rather arbitrary. This is the first weak point.
In the next step, the phonetic representation, together with the kanji, is looked up in an index file ("words") to ascertain whether the dictionary has an entry corresponding to the word or expression being looked up. If it does, the search engine turns to a particular html file to retrieve the dictionary entry. The name of the html file is determined by the first two letters of the phonetic representation. Whether this is the appropriate html file therefore depends on whether the kanji-kana matching happened to fit the context. Even if the appropriate html file is accessed, there is no guarantee that the search engine will come up with the correct result, because at this stage the engine has already "forgotten" which kanji the user was looking for. It therefore happily presents the first entry that matches the phonetic representation as the result.
The fact that the Japanese language has an extremely large number of homophones speaks clearly against the chosen approach.
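The failure mode described above can be reproduced with a toy model. This is only a sketch: the three sample entries, the table names and the function `kana_first_lookup` are assumptions made up for illustration, and the "words" index check is left out. It is not Kobo's actual code or data layout.

```python
# Toy model of a kana-first lookup. All data and names are hypothetical.

# Step 1 database: each kanji word maps to its possible readings.
READINGS = {
    "橋": ["はし"],
    "箸": ["はし"],
    "端": ["はし", "はた"],
}

# Entries in file order, grouped by the first two kana of the reading.
HTML_FILES = {
    "はし": [
        ("はし", "橋", "bridge"),
        ("はし", "箸", "chopsticks"),
        ("はし", "端", "edge"),
    ],
}


def kana_first_lookup(kanji):
    """Look a word up by its reading only, as described in the post."""
    reading = READINGS[kanji][0]              # arbitrary pick among the readings
    bucket = HTML_FILES.get(reading[:2], [])  # file chosen by the first two kana
    for entry_reading, _entry_kanji, gloss in bucket:
        if entry_reading == reading:          # the kanji is never compared
            return gloss
    return None


print(kana_first_lookup("橋"))  # bridge (correct, but only by luck)
print(kana_first_lookup("箸"))  # bridge (wrong: should be chopsticks)
```

With three homophones in one file, two of the three lookups return the wrong entry, because the first match on the reading always wins.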
My suggestion would therefore be to take the kanji (or kanji+kana expressions, as the case may be) as the primary means of organizing the material. The only necessary modification to the general procedure (which the KT follows for the other languages) would be that the name of each html file would consist of only one letter (= kanji). This would result in approximately 4749 html files in the case of a decent Japanese dictionary (the present Japanese dictionary consists of ca. 4100 files). My calculation is based on a version of the free edict dictionary file that contains 205 721 entries (cf. http://www.csse.monash.edu.au/~jwb/edict.html). To take care of words that are commonly written in kana rather than in kanji, and of words starting with numeric characters, some further html files would be needed (232 in the case of my edict version).
I think the improvement would be tremendous and the effort worthwhile. I would be happy to answer any questions you may have.
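A minimal sketch of the proposed organization, assuming each entry is a (headword, reading, gloss) triple and each file is simply named after the first character of the headword plus ".html". The sample entries and the names `build_buckets` and `kanji_first_lookup` are hypothetical; the real file format and naming rules may differ.

```python
from collections import defaultdict

# Hypothetical sample entries: (headword, reading, gloss).
ENTRIES = [
    ("橋", "はし", "bridge"),
    ("箸", "はし", "chopsticks"),
    ("端", "はし", "edge"),
    ("箸置き", "はしおき", "chopstick rest"),
]


def build_buckets(entries):
    """Group entries into one file per first character of the headword."""
    buckets = defaultdict(dict)
    for headword, _reading, gloss in entries:
        file_name = headword[0] + ".html"  # assumed naming scheme
        buckets[file_name][headword] = gloss
    return buckets


def kanji_first_lookup(buckets, word):
    """Pick the file by the first character, then match the written form."""
    return buckets.get(word[0] + ".html", {}).get(word)


buckets = build_buckets(ENTRIES)
print(sorted(buckets))                    # one file per distinct first character
print(kanji_first_lookup(buckets, "箸"))  # chopsticks
```

Because the written form is the key, homophones no longer collide. Counting the distinct first characters over a full word list gives the number of files, which is how a figure such as the 4749 mentioned above could be obtained.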

Last edited by tshering; 12-20-2012 at 02:52 PM.
Old 11-20-2012, 10:57 PM   #2
Uschiekid
Tenrec
Posts: 724
Karma: 1076988
Join Date: Oct 2012
Device: Kobo Aura One, Kobo Glo
sort of related...
Are you currently able to go into the dictionary function, type in a word in Japanese and look it up? Or only by highlighting one or more kanji/kana?
I used to be able to do the former, but after resetting because of a different problem, I can no longer do this in Japanese (and, weirdly, not in French either). Just curious whether there are others out there who can....
And you seem to be the lonely one out there who cares about the Japanese dictionary function....


Old 11-21-2012, 04:15 AM   #3
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
Quote:
Originally Posted by Uschiekid View Post
Are you currently able to go into the dictionary function, type in a word in Japanese and look it up? Or only by highlighting one or more kanji/kana? I used to be able to do the former, but after resetting because of a different problem, I can no longer do this in Japanese (and, weirdly, not in French either).
No, it is the same for me. I cannot open the dictionary function with the French and Japanese dictionaries, and maybe not with some other languages either (I did not try all of them).
However, I made a small sample Japanese-English dictionary (the first 50000 entries or so of Jim Breen's Edict) following the pattern of the Kobo English dictionary rather than that of the Kobo Japanese dictionary. After installing it disguised as the German dictionary, it works as expected and I can access the dictionary function too. To type hiragana/katakana, however, one has to change the user language to Japanese.
If you would like to search the French dictionary, just rename it to dicthtml-de.zip, or to the file name of another dictionary language that you do not use (and that allows searching).
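The renaming trick could be scripted roughly as follows. This is a sketch under assumptions: the helper `disguise_dictionary` is hypothetical, the source file name in the usage comment (dicthtml-fr.zip) is a guess by analogy with dicthtml-de.zip, and the script copies instead of renaming so that the original file survives.

```python
import shutil
from pathlib import Path


def disguise_dictionary(source_zip, target_lang="de"):
    """Copy a dictionary zip to the file name of another language's dictionary.

    An existing target file is first saved with a ".bak" suffix.
    """
    source = Path(source_zip)
    target = source.with_name(f"dicthtml-{target_lang}.zip")
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return target


# Usage (the path is hypothetical; adjust it to wherever your dictionaries are):
# disguise_dictionary("/path/to/dictionaries/dicthtml-fr.zip", "de")
```

Copying keeps the French file in place, so the change can be undone by deleting the copy and restoring the ".bak" file.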
Old 11-21-2012, 10:49 AM   #4
vice-versa
whippet addict
Posts: 380
Karma: 689884
Join Date: Dec 2011
Location: France, Normandy, Gisors
Device: Kobo Glo, Kobo Aura 6", Kobo Glo HD, Kobo Aura One, Kobo Sage
I'm not fluent enough to read in Japanese, but I noticed something funny... I mostly read books in English, but sometimes find some interesting books in French too (I set my main language to French on the device, as I'm French). Lately, I encountered a word in a French book that I wanted to look up. I selected it and it was the English-English dictionary that popped up, with no way to access the French-French dictionary. Another thing: no proper names show up in the dictionary (they can be useful to place a town or a historical character).
Old 11-21-2012, 12:49 PM   #5
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
Quote:
Originally Posted by vice-versa View Post
Lately, I encountered a word in a French book that I wanted to look up. I selected it and it was the English-English dictionary that popped up, with no way to access the French-French dictionary.
Interesting! Did this happen with all words that you looked up? Or did you look up only this one word of this book? This behavior might depend on incorrect language settings of the book.


Quote:
Originally Posted by vice-versa View Post
Another thing: no proper names show up in the dictionary (they can be useful to place a town or a historical character).
Maybe you would like to add a further French dictionary to your reader, as ShellShock did with English dictionaries (link). Information on how to build dictionaries is available here (link), especially here (link). ShellShock will perhaps give some detailed instructions for Windows users (cf. this post). I am not sure whether one can find an appropriate French dictionary for conversion. I think the Encyclopédie de Diderot et d'Alembert might be useful for older material, and one can download it easily (http://encyclopédie.eu/A.html etc.).
Old 11-21-2012, 03:48 PM   #6
Uschiekid
Tenrec
Posts: 724
Karma: 1076988
Join Date: Oct 2012
Device: Kobo Aura One, Kobo Glo
Quote:
Originally Posted by tshering View Post
No, it is the same for me. I cannot open the dictionary function with the French and Japanese dictionaries, and maybe not with some other languages either (I did not try all of them).
However, I made a small sample Japanese-English dictionary (the first 50000 entries or so of Jim Breen's Edict) following the pattern of the Kobo English dictionary rather than that of the Kobo Japanese dictionary. After installing it disguised as the German dictionary, it works as expected and I can access the dictionary function too. To type hiragana/katakana, however, one has to change the user language to Japanese.
If you would like to search the French dictionary, just rename it to dicthtml-de.zip, or to the file name of another dictionary language that you do not use (and that allows searching).
Interesting system....didn't realise that type of thing was possible

I'm hoping they fix these dictionaries soon. I can be patient for now anyway, since I don't currently have any Japanese books I want to read (I just downloaded some random free ones to experiment with)....I'm just keeping my eye on the situation to see how things unfold.

Quote:
Maybe you would like to add a further French dictionary to your reader, as ShellShock did with English dictionaries (link). Information on how to build dictionaries is available here (link), especially here (link). ShellShock will perhaps give some detailed instructions for Windows users (cf. this post). I am not sure whether one can find an appropriate French dictionary for conversion. I think the Encyclopédie de Diderot et d'Alembert might be useful for older material, and one can download it easily (http://encyclopédie.eu/A.html etc.).
but if i DO get desperate, I might just look into what you suggested above....thanks for the links!


Quote:
Today 10:49 vice-versa
I'm not fluent enough to read in Japanese, but I noticed something funny... I mostly read books in English, but sometimes find some interesting books in French too (I set my main language to French on the device, as I'm French). Lately, I encountered a word in a French book that I wanted to look up. I selected it and it was the English-English dictionary that popped up, with no way to access the French-French dictionary. Another thing: no proper names show up in the dictionary (they can be useful to place a town or a historical character).
Sounds quite annoying. I am less than impressed that the French and Japanese dictionaries are less than fully functional...especially given Kobo's connection to both Canada and Japan.
Old 11-21-2012, 04:18 PM   #7
vice-versa
whippet addict
Posts: 380
Karma: 689884
Join Date: Dec 2011
Location: France, Normandy, Gisors
Device: Kobo Glo, Kobo Aura 6", Kobo Glo HD, Kobo Aura One, Kobo Sage
To be honest, I don't want to bother with compiling a dictionary. I'd rather note the words I want to look up on paper (or at least highlight them) and then search the web for them when I'm in front of my laptop. As for the book I'm currently reading in French, I just checked the metadata in calibre (I convert all my books from epub to epub in calibre): the language is set to French... so I don't know why it's the English-English dictionary that pops up...
Old 12-20-2012, 02:47 PM   #8
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
I am happy to see that the Japanese definition dictionary (dated 21.11.2012) has been modified as I suggested in the first post of this thread. I do not know whether this was in response to this thread or whether it is a coincidence. In any case, I am thankful for the change and would like to say thank you to all the decision makers and developers involved.
Old 01-06-2013, 02:11 PM   #9
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
Today, my Touch updated to 2.3.1, so I now have first-hand experience of the new dictionary engine for Japanese. It is still not possible to edit the search term. As a result, if you point at an inflected word, chances are high that you will not get the dictionary entry you wanted, and there is no way to change the ending to the dictionary form. I therefore still prefer the custom Japanese-English dictionary provided here.

The same bug is also still present in the French definition dictionary.

Last edited by tshering; 01-07-2013 at 04:22 AM.
Old 01-06-2013, 09:14 PM   #10
murg
No Comment
Posts: 3,238
Karma: 23878043
Join Date: Jan 2012
Location: Australia
Device: Kobo: Not just an eReader, it's an adventure!
Quote:
Originally Posted by tshering View Post
Today, my Touch updated to 3.1.1.
3.1.1?
Old 01-07-2013, 04:21 AM   #11
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
Quote:
Originally Posted by murg View Post
3.1.1?
Thank you for pointing out the typo. I corrected the original post.
Old 02-01-2018, 12:26 PM   #12
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
I would like to mention that the performance of the English definitions dictionary (and presumably also of other dictionaries) could easily be improved by making better use of the variants that are recorded in the dictionary. Currently, variants seem not to be taken into consideration when they are available only in abbreviated form, like "-aries" in the case of "boundary". To check this, I went through the dictionary and resolved the abbreviated variants. I have attached some images for illustration.
I resolved a little more than 7500 abbreviated variants. The gain, however, is smaller than this number might suggest, since many cases are already covered by a kind of automatic matching, which seems to succeed in cases like radiate : radiated.
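One possible way to resolve an abbreviated variant is sketched below. The post does not say how the variants were actually resolved, so the heuristic here (cut the headword at the last occurrence of the variant's first letter) and the function name `expand_variant` are assumptions. The heuristic will not be right for every entry, so its results would need checking.

```python
def expand_variant(headword, variant):
    """Expand an abbreviated variant such as "-aries" against its headword.

    Heuristic (an assumption, not the method used in the post): cut the
    headword at the last occurrence of the variant's first letter and
    append the variant there.
    """
    if not variant.startswith("-"):
        return variant            # already written out in full
    suffix = variant[1:]
    if not suffix:
        return headword
    cut = headword.rfind(suffix[0])
    if cut == -1:
        return headword + suffix  # no overlap found: simply append
    return headword[:cut] + suffix


print(expand_variant("boundary", "-aries"))  # boundaries
print(expand_variant("cactus", "-ti"))       # cacti
```

The expanded forms could then be added to the dictionary as full variants, so that a lookup of "boundaries" finds the entry for "boundary".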
Attached Thumbnails (8 screenshots): screen_20180201_172603.png, screen_20180201_172728.png, screen_20180201_172832.png, screen_20180201_173033.png, screen_20180201_173237.png, screen_20180201_173414.png, screen_20180201_173608.png, screen_20180201_173147.png

Last edited by tshering; 02-01-2018 at 12:30 PM.
Tags
improvement, japanese dictionary