#1 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
Custom Japanese-English dictionary
The original opening of this post is now out of date, so I have moved it into the spoiler. Start with the newest dictionary file (dicthtml-ja.zip); it should be fine for most purposes. If you really need to search the Japanese dictionary, install the two other files and read the spoilers. (In more recent firmwares, for instance 3.12.0, you can rename japdic01.zip and japdic02.zip to dicthtml-ja-en, dicthtml-ja-de or similar; cf. also this post.)
Spoiler:
NEW: With FW 2.3.1 the dictionary engine for Japanese has improved a lot, so I now provide the EDICT for this version. In addition to the material in japdic01.zip and japdic02.zip, it also contains all entries starting with fullwidth Latin characters (e.g., "S造" and "100円玉"). Unfortunately, this file has to replace the original dicthtml-ja.zip, since renaming it prevents it from functioning. One way to use the information of both dictionaries (the one provided by Kobo and the EDICT) is to merge them into one. Because of copyright concerns I do not provide such a merged dictionary here. Since even under FW 2.3.1 it is impossible to type into the search window of the Japanese dictionary, japdic01.zip and japdic02.zip are still more convenient to use in many cases.
Spoiler:
DISCLAIMER
You assume total responsibility and risk for your use of the provided files; use at your own risk.
ACKNOWLEDGEMENT
This package uses the EDICT dictionary file. This file is the property of the Electronic Dictionary Research and Development Group, and is used in conformance with the Group's licence. I would like to thank several forum members, especially those posting in this thread, for advice and encouragement.
EDIT: With more recent firmwares, the name of the Japanese dictionary has been changed to dicthtml-jaxxdjs.zip (instead of dicthtml-ja.zip).
Last edited by tshering; 03-23-2017 at 07:05 PM.
#2 |
Member
Posts: 24
Karma: 476046
Join Date: Nov 2012
Location: Poland
Device: Kobo Touch, Kindle K3
|
tshering, I was searching for a Japanese-English dictionary for the Kobo, but there is none on the market. I even tried creating one based on EDICT but ran into some issues. Now I am going to test yours and will let you know the results.
ありがとう!
Last edited by andrusz; 11-29-2012 at 04:13 AM.
#3 |
Member
|
Following your suggestions, I installed japdic01 as the Portuguese dictionary (which I already had installed) and it works very well :-) The only drawback is that my Kobo Touch uses the original Japanese dictionary every time I look up a kanji while reading a book, so I have to tap the "Dictionary" icon and change the dictionary from Japanese to Portuguese. I have to do this for every kanji, which is a little frustrating...
So I tried to replace the Japanese dictionary with japdic01 (or japdic02), but it does not work for me: the Kobo cannot find any kanji :-( Could you explain how to do it, if it is possible?
Finally, I added japdic01 as a new dictionary, jap-eng. It works very well, too :-) However, the dictionary's availability depends on the book. Some books allow you to choose "Translation dictionary", but most of the ones I have (from Aozora) do not. Very odd.
tshering, thanks for a good job!
Last edited by andrusz; 11-29-2012 at 04:14 AM.
#4 | ||
Wizard
|
Quote:
Quote:
How did you do this? I would recommend using both japdic01 and japdic02 as definition dictionaries rather than using one of them as a translation dictionary (I personally opted for Spanish and Dutch). If the language of a book is defined as Japanese (and you cannot or do not want to change it), the reader (usually?) does not offer you any translation dictionaries. The reason is that there is no original Japanese translation dictionary. Another advantage of having both defined as definition dictionaries is that in some situations it is easier to change from one definition dictionary to another than it is to change from a definition dictionary to a translation dictionary.
#5 | |||||
Member
|
Quote:
Quote:
- The old-format Japanese dictionary contains UTF-8 encoded filenames in the dicthtml-ja.zip file, but the files (descriptions) are encrypted.
- The new-format Japanese dictionary contains Shift_JIS encoded filenames (this is my guess; I cannot verify it) in the dicthtml-ja.zip file. The files, however, are standard UTF-8 HTML definition files (gzipped).
So it seems that changing the filenames in your dictionary from UTF-8 to Shift_JIS should "convert" the translation dictionary into a Japanese dictionary.
Quote:
If you want the dictionary to also be shown in the Settings screen, you need to add a line to the "Dictionary" table in the KoboReader database (with values similar to those of the other installed dictionaries, except for the "Suffix" and "Name" columns, of course).
Quote:
Quote:
Last edited by andrusz; 11-30-2012 at 07:15 AM.
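The database step mentioned in this post (adding a line to the "Dictionary" table) could be sketched roughly as below. This is only a sketch under assumptions: the post names the table and its "Suffix" and "Name" columns but not the remaining columns, so the code copies an already installed row as a template instead of guessing them. The function name and all suffix and name values are hypothetical, and if the table has a unique key other than the suffix, the insert would need adjusting. Work on a backup copy of the database.

```python
import sqlite3


def clone_dictionary_row(conn, template_suffix, new_suffix, new_name):
    """Copy an existing row of the "Dictionary" table, changing only Suffix and Name.

    Only the table name and these two column names come from the post; every
    other column is copied unchanged from the template row.
    """
    cur = conn.execute(
        'SELECT * FROM "Dictionary" WHERE "Suffix" = ?', (template_suffix,)
    )
    row = cur.fetchone()
    if row is None:
        raise ValueError(f"no dictionary row with suffix {template_suffix!r}")

    # Column names of the result set, in table order.
    columns = [desc[0] for desc in cur.description]
    values = dict(zip(columns, row))
    values["Suffix"] = new_suffix
    values["Name"] = new_name

    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f'INSERT INTO "Dictionary" ({column_list}) VALUES ({placeholders})',
        [values[c] for c in columns],
    )
    conn.commit()
```

A call might look like `clone_dictionary_row(sqlite3.connect(path_to_db_copy), "-pt", "-ja-en", "Japanese-English")`, where the path and both suffixes are placeholders to be replaced with the values found on the actual device.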
#6 | ||
Wizard
|
Quote:
As for the filenames: according to my understanding, under FW 2.0.0 the filenames were UTF-8 encoded, and with later FWs they were 2-byte Unicode encoded. This is how I encoded them in my dictionary (namely 2-byte Unicode). But sometimes it is hard to say what kind of encoding conversions are executed behind the scenes (by the OS or the decompression tools). By saying Quote:
Does not work for me. Which FW are you on?
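To make the encodings discussed in this exchange concrete, here is a small sketch that shows the bytes of one and the same filename under UTF-8, 2-byte Unicode (taken here to mean UTF-16, which is an assumption) and Shift_JIS. The sample name こう.html is borrowed from a later post in this thread; the function name is made up for illustration.

```python
def encoded_forms(name):
    """Return the byte string of `name` under three candidate encodings."""
    return {enc: name.encode(enc) for enc in ("utf-8", "utf-16-le", "shift_jis")}


if __name__ == "__main__":
    # Print the length and a hex dump of the name in each encoding.
    for enc, raw in encoded_forms("こう.html").items():
        print(f"{enc:10} {len(raw):2d} bytes  {raw.hex(' ')}")
```

The byte sequences differ in both content and length, so a tool that assumes the wrong encoding will show a garbled name rather than fail outright.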
#7 | |||
Member
|
Quote:
Quote:
Quote:
You can easily check that it is installed:
- open any English book
- activate the dictionary by touching a word
- click on the A/Z icon and choose "Translation dictionary"
Your new dictionary should be on the list.
#8 |
Wizard
|
I have to confess that what I wrote above about the filename encoding is nonsense. The filenames are stored in the zip file in UTF-8 encoding. You can verify this by opening the zip file in a hex editor and inspecting the filenames. When the files are unzipped, the encoding of the filenames depends on the settings of the unzipping program and the OS. 7-zip under Windows writes them by default as UTF-16 (Unicode).
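As an alternative to a hex editor, the stored names can also be inspected with a short script. The sketch below uses Python's standard zipfile module; the helper names are made up for illustration. It relies on two facts about that module: names without the zip UTF-8 flag (general-purpose bit 11) are decoded as CP437, and flagged names as UTF-8, so re-encoding with the same codec gives back the stored bytes. The module also normalizes path separators on some platforms, so the hex editor remains the reference in doubtful cases.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the member name is declared as UTF-8


def stored_names(zip_source):
    """Return a list of (raw_name_bytes, utf8_flag_set) for each archive member."""
    result = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            flagged = bool(info.flag_bits & UTF8_FLAG)
            # Undo zipfile's decoding to get back the bytes stored in the archive.
            raw = info.filename.encode("utf-8" if flagged else "cp437")
            result.append((raw, flagged))
    return result


def is_valid_utf8(raw):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Running `stored_names("dicthtml-ja.zip")` and passing each raw name to `is_valid_utf8` shows whether the names in a given dictionary file are plausibly UTF-8, with or without the flag set.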
#9 | |
Member
|
Quote:
I have observed the same on Windows 7 (English version) when opening the files with WinZip:
Old dictionary: [screenshot]
New dictionary: [screenshot]
Have you observed the same?
#10 |
Wizard
|
andrusz, thank you for the images. Now I see that we were talking about different things. When I talked about the old version, I meant the first one, which was contained in KoboRoot.tar of kobo3-update-2.0.0.zip. What I called the new version is your old version. I got different results with different tools.
Below, as an example, are the results for one and the same file extracted with different tools from 1) kobo3-update-2.0.0.zip, 2) your old version, and 3) your new version.
1) kobo3-update-2.0.0.zip
- 7zFM (7-zip GUI): .html 12.07.2012
- GnuWin32\bin\tar.exe: “†.html 12.07.2012 (this is utf8)
2) dicthtml-ja (old version)
- 7zFM (7-zip GUI): こう.html 07.08.2012 Host OS: Unix
- Windows zip (as used by the file explorer): .html 07.08.2012
3) dicthtml-ja (new version)
- 7zFM (7-zip GUI): .html 12.11.2012 Host OS: FAT
- Windows zip (as used by the file explorer): .html 12.11.2012
I do not know what kind of encoding this is. Inside the html.gz file the filename is again encoded in UTF-8. It is nice that the dictionary zip file now contains the source file for "words" too!
Last edited by tshering; 12-03-2012 at 12:56 PM.
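One possible reading of the “†.html result from tar.exe: the characters “ and † are what the bytes 0x93 and 0x86 look like in Windows-1252, and those two bytes occur in the UTF-8 form of こう (e3 81 93 e3 81 86). The sketch below reproduces that effect. Which code page the tool actually used is an assumption, and the function name is made up for illustration.

```python
def shown_as(raw, codec):
    """Decode raw bytes with a (possibly wrong) codec, dropping undefined bytes."""
    return raw.decode(codec, errors="ignore")


# UTF-8 bytes of the sample name: e3 81 93 e3 81 86 followed by ".html"
utf8_name = "こう.html".encode("utf-8")

print(shown_as(utf8_name, "utf-8"))   # the intended name
print(shown_as(utf8_name, "cp1252"))  # the same bytes read as Windows-1252
```

In the Windows-1252 reading, byte 0x81 has no character assigned and is dropped, while 0xe3 becomes ã, so the garbled name still contains “ and † in the same order as in the listing above.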
#11 | ||
Member
|
Quote:
Quote:
My source list looks like the words.original file. I am using Marisa to create the words file, and it works fine with the marisa-lookup and marisa-predictive-search tools. But when the file is added to my test dictionary (the zip file), it does not work on the Kobo.
What I have observed using the Marisa tools is that the output for the original Kobo words file differs from the output for mine. Below is the output from marisa-predictive-search.
Original file:
Code:
いえ
18 found
71 いえ いえ
3502 いえ【家】 いえ
10968 いえい【遺影】 いえ
24088 いえがまえ【家構え】 いえ
24089 いえがら【家柄】 いえ
10969 いえき【胃液】 いえ
10970 いえじ【家路】 いえ
10971 いえで【家出】 いえ
10972 いえども【雖も】 いえ
10973 いえなみ【家並み】 いえ
My file:
Code:
いえ
43 found
58634 いえい【遺影】
58635 いえい【遺詠】
37152 いえい【家居】
37153 いえいえ【否否・否々】
37154 いえいえ【家々・家家】
37155 いえつき【家付き】
37156 いえつきのむすめ【家付きの娘】
37157 いえつきむすめ【家付き娘・家付娘】
18160 いえつづき【家続き】
58636 いえにかえる【家に帰る】
#12 |
Wizard
|
In the marisa output for your file, the "いえ" at the end of each line is missing. I therefore guess that there is something at the end of each line that prevents the rest of the line from being shown. Did you check that all lines in your file end in LF (without CR)?
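The line-ending check suggested here can be done with a few lines of Python. This is a minimal sketch; the function names are made up for illustration, and the word list is handled as raw bytes so that no text encoding is assumed.

```python
def count_crlf_lines(data):
    """Count the lines in a byte string that end in CR LF."""
    return sum(
        1 for line in data.splitlines(keepends=True) if line.endswith(b"\r\n")
    )


def to_lf(data):
    """Convert CR LF (and any stray CR) line endings to plain LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

For a word list on disk, read it with `open(path, "rb").read()`, pass the bytes to `count_crlf_lines`, and if the count is not zero, write the result of `to_lf` back before building the words file. Operating on bytes is safe for UTF-8 text because the bytes 0x0D and 0x0A never occur inside a multi-byte UTF-8 character.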
#13 |
Member
|
#14 |
Member
|
Below is my Japanese-English dictionary based on EDICT2. It has been created as a replacement for the original Kobo Japanese dictionary.
To install it, just copy the dicthtml-ja.zip file into the .kobo/dict folder on your Kobo reader, replacing the existing file. It is a good idea to back up the original dicthtml-ja.zip before overwriting it.
Japanese-English dictionary
Warning: The file cannot be used as a Japanese-English translation dictionary. This means you cannot change the name of the file and add it as an additional Kobo dictionary; it will not work that way!
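The install step with a backup can also be scripted. The following is a minimal sketch, assuming the Kobo is mounted as a normal drive; the function name, the mount point and the backup location are hypothetical and need to be adapted. Only the .kobo/dict folder and the dicthtml-ja.zip file name come from the post.

```python
import shutil
from pathlib import Path


def install_dictionary(new_zip, kobo_root, backup_dir, name="dicthtml-ja.zip"):
    """Back up <kobo_root>/.kobo/dict/<name> if present, then copy new_zip over it."""
    target = Path(kobo_root) / ".kobo" / "dict" / name
    backup = Path(backup_dir) / name
    if target.exists() and not backup.exists():
        # Keep the first original only; a later run must not overwrite it.
        backup.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(target, backup)
    shutil.copy2(new_zip, target)
    return target
```

The backup is written outside the reader's dict folder on purpose, so that no extra file ends up next to the dictionaries on the device. To restore the original dictionary, copy the backup file back into .kobo/dict.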
#15 |
Wizard
|
Great work, Andrusz! However, I would like to repeat that the search mechanism used for dicthtml-ja does not do a good job. For an explanation, see this post. You can verify what I am saying by pointing at, e.g., 調. With your dictionary loaded, the reader will provide as the closest match "しらべ [白檜] (n) (uk) Veitch's silver fir (Abies veitchii)/". The correct answer, however, is: "ちょう [調] /(n) (1) pitch/tone/key/(2) time/tempo/(n,suf) (3) mood/tendency/style/(n) (4) (arch) tax on products/". Note: My KT is still on FW 2.1.5. If the handling of the Japanese dictionary has improved under a more recent FW, please tell me, because then I would upgrade manually.
Tags |
dictionary, japanese, kobo |