06-11-2022, 04:28 PM | #1 |
Evangelist
Posts: 496
Karma: 356531
Join Date: Jul 2016
Location: 'burta, Canada
Device: Kobo Glo HD
|
Implementing predictive text in English (sort of; let's discuss)
I'm not sure how many people are aware of this, but when you switch your device to Japanese, you get access to a predictive text keyboard. Predictive text matters there because Japanese has four different writing systems and text input is done phonetically through the Roman alphabet, so the system needs a way to translate those Roman letters into the appropriate kana or kanji. For example, many characters can represent the syllable "ka", so predictive text lets you select which one you mean.
I find predictive text useful on touch devices, so I wanted to see if it could be used for English words, since the functionality is already there. TL;DR: It sort of works when you replace the input method's Japanese dictionaries with English versions pulled from a PS Vita (screenshots attached), but only in the Japanese locale. And as soon as you type a valid Japanese syllable, it displays hiragana or katakana suggestions only (so that part must be built in), which makes it less useful as an English keyboard. Still, there's a proof of concept here. Some notes about the Japanese keyboard:
This is what the JA directory looks like on the Kobo: Code:
$ ls -Rl
.:
total 24
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha     0 Jun 11 12:57 32/
-rwxrwxr-x+ 1 Reg Tiangha Reg Tiangha 18817 May 31 13:16 njcon.a

./32:
total 4692
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha   82671 May 31 13:16 njexyomi.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha    9122 May 31 13:16 njexyomi_new.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha   16717 May 31 13:16 njexyomi_re.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha    5785 May 31 13:16 njfzk.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha   81907 May 31 13:16 njtan.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha  516199 May 31 13:16 njubase1.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 4071969 May 31 13:16 njubase2.a

On the PS Vita, the JA directory structure and files are a bit different: Code:
$ ls -Rl
.:
total 20
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha     0 Jun 11 10:58 16/
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha 18817 Jun 11 10:59 njcon.a

./16:
total 4628
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha   82299 Jun 11 10:58 njexyomi.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha    5785 Jun 11 10:58 njfzk.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha   81907 Jun 11 10:58 njtan.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha  513643 Jun 11 10:58 njubase1.a
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 4046711 Jun 11 10:58 njubase2.a

And the EN directory from the Vita: Code:
$ ls -Rl
.:
total 1617
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha       0 Jun 11 11:01 GB/
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha       0 Jun 11 11:01 US/
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha     160 Jun 11 11:00 njcon.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha  137152 Jun 11 11:00 njubase1.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha  258682 Jun 11 11:00 njubase2.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha 1247452 Jun 11 11:00 njubase3.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha    1759 Jun 11 11:00 njyomi.a

./GB:
total 8
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 4608 Jun 11 11:01 njubase1gb.a

./US:
total 4
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 3584 Jun 11 11:01 njubase1us.a

To test compatibility, I erased the contents of the JA directory on the Kobo and replaced them verbatim with the Vita version. It didn't work. But when I renamed the 16 directory to 32 to match the original directory structure, it DID work and I could do Japanese word lookups. So this proved dictionary compatibility.

So I tried the same with the EN dictionaries, first by copying the EN directory to the /usr/local/Kobo/dic directory, but nothing happened, either in English mode on the Japanese keyboard or in the English locale. Next, I tried copying everything over verbatim into the JA folder; no dice. So I created a folder named 32 and moved every file except for njcon.a into it, and TA-DA! English word lookups started to work (screenshots attached)!

Here is what the final directory structure looked like using just the EN data: Code:
$ ls -Rl
.:
total 1
drwxr-xr-x 1 Reg Tiangha Reg Tiangha   0 Jun 11 13:54 32/
-rwxr-xr-x 1 Reg Tiangha Reg Tiangha 160 Jun 11 11:00 njcon.a

./32:
total 1616
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha       0 Jun 11 11:01 GB/
drwxrwxr-x+ 1 Reg Tiangha Reg Tiangha       0 Jun 11 11:01 US/
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha  137152 Jun 11 11:00 njubase1.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha  258682 Jun 11 11:00 njubase2.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha 1247452 Jun 11 11:00 njubase3.a
-rwxr-xr-x  1 Reg Tiangha Reg Tiangha    1759 Jun 11 11:00 njyomi.a

./32/GB:
total 8
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 4608 Jun 11 11:01 njubase1gb.a

./32/US:
total 4
-rwx---r-x+ 1 Reg Tiangha Reg Tiangha 3584 Jun 11 11:01 njubase1us.a

And for fun, I tried mixing and matching dictionary files to see if I could do both English and kanji lookups. After various combinations, I found that the three njubase files from the EN distribution (njubase1.a, njubase2.a and njubase3.a) definitely need to be there to do English word lookups.

Limitations: Unfortunately, as soon as you enter a valid Japanese syllable (e.g. ta, ke, mo), all of the English suggestions disappear and are replaced with hiragana or katakana suggestions only (or kanji, if the other dictionary files are also present). So this limits the usefulness somewhat. But we do have a proof of concept that works as long as you have the dictionary data.

Questions:
This was a fun exercise, but I think I've taken it as far as I can with my current skills. What do other people think? Last edited by rtiangha; 06-11-2022 at 05:54 PM. |
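For anyone wanting to repeat the experiment from post #1, the "create a folder named 32 and move everything except njcon.a into it" step can be sketched as a small shell function. This is only a sketch under assumptions: the function name stage_dic_layout and both paths are hypothetical, it is meant to run on a PC against a copy of the dictionary files rather than on the device, and the handling of a 16 folder simply mirrors the "rename 16 to 32" step described for the Vita JA data.

```shell
#!/bin/sh
# Sketch only (hypothetical helper, not from the original post).
# stage_dic_layout SRC DEST
#   SRC  - a copy of a Vita-style dictionary folder (EN or JA)
#   DEST - new folder laid out the way the post says worked:
#          njcon.a at the top level, everything else under 32/
stage_dic_layout() {
    src=$1
    dest=$2
    mkdir -p "$dest/32" || return 1
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue    # nothing to do for an empty SRC
        name=$(basename "$entry")
        case "$name" in
            njcon.a)
                # njcon.a stays next to the 32 folder
                cp -p "$entry" "$dest/njcon.a" || return 1 ;;
            16|32)
                # JA case: the contents of 16 (or an existing 32) become 32
                cp -Rp "$entry"/. "$dest/32/" || return 1 ;;
            *)
                # EN case: dictionary files and the GB/ and US/ subfolders
                cp -Rp "$entry" "$dest/32/" || return 1 ;;
        esac
    done
}

# Example (paths are placeholders):
#   stage_dic_layout ./vita_EN ./staged_JA
```

The staged folder's contents would then replace what is in the Kobo's JA dictionary folder. Working on a copy keeps the original Kobo files around as a backup in case the keyboard stops working.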
06-11-2022, 04:46 PM | #2 |
Evangelist
Posts: 496
Karma: 356531
Join Date: Jul 2016
Location: 'burta, Canada
Device: Kobo Glo HD
|
Also, if you were wondering how I got English labels in Japanese mode, I followed tshering's old trick of extracting the English translation file from nickel, renaming it trans_ja.qm and placing it into /usr/local/Kobo/translations/ (do NOT flash the file in that link; it's way too old). That way, everything is translated into English.
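The copy-and-rename part of that trick can be sketched like this. It assumes you have already extracted the English .qm file yourself (the extraction is not shown), and the function name install_ja_translation, the backup file name and the example source file name are all hypothetical; only trans_ja.qm and /usr/local/Kobo/translations/ come from the post.

```shell
#!/bin/sh
# Sketch only (hypothetical helper, not from the original post).
# install_ja_translation QM_FILE TRANSLATIONS_DIR
#   QM_FILE          - the already-extracted English translation file
#   TRANSLATIONS_DIR - e.g. /usr/local/Kobo/translations on the device
install_ja_translation() {
    qm=$1
    dir=$2
    [ -f "$qm" ] || { echo "no such file: $qm" >&2; return 1; }
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Back up an existing trans_ja.qm once, so the first backup is never overwritten
    if [ -f "$dir/trans_ja.qm" ] && [ ! -e "$dir/trans_ja.qm.bak" ]; then
        cp -p "$dir/trans_ja.qm" "$dir/trans_ja.qm.bak" || return 1
    fi
    # The English file takes the Japanese file's name
    cp "$qm" "$dir/trans_ja.qm"
}

# Example (source file name is a placeholder):
#   install_ja_translation ./english.qm /usr/local/Kobo/translations
```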
|
06-12-2022, 04:12 AM | #3 |
Connoisseur
Posts: 92
Karma: 10988
Join Date: Dec 2018
Device: Kobo Clara HD
|
Fascinating! This would definitely be a nice to have.
|