Dictionaries used by the Tolino app are stored under
.tolino/dictionaries/ on the user data partition. The format used is that of
QuickDic (*.quickdic).
Existing Dictionaries
The original QuickDic was an Android app written by Thad Hughes and eventually open-sourced. Dictionary files were hosted on Google Code for download, but all of them were deleted and apparently lost when Google shut down the site. A Web Archive snapshot of the project repository is available, but the files cannot be downloaded from it.
The project was later resurrected as QuickDic Restored by Reimar Döffinger. The author's repository contains many dictionaries generated from Wiktionary, a sister project of Wikipedia, which was also the source of the original QuickDic data. However, as part of his work on the app, the author improved the dictionary format, which means that newer dictionaries (v007 instead of v006) are no longer compatible with the Tolino.
These Wiktionary-based dictionaries can be downloaded on GitHub:
Make sure to download only the files labeled v006.
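If a file's label is missing or unclear, the version can probably be read from the file itself. The following Python sketch assumes that a .quickdic file begins with a 4-byte big-endian integer holding the format version, which is how the QuickDic source code appears to read it. This layout is an assumption, not something confirmed here, and the function name quickdic_version is made up for illustration.

```python
import struct


def quickdic_version(path):
    """Return the format version stored at the start of a .quickdic file.

    Assumption: the file starts with a 4-byte big-endian signed integer
    (the way Java's DataInput.readInt() would read it).
    """
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        raise ValueError("file too short to contain a version header")
    return struct.unpack(">i", header)[0]
```

Under that assumption, a result of 6 would indicate a file the Tolino can read, and 7 a newer, incompatible one.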
Creating Dictionaries: The Tool
DictionaryPC is a Java tool for generating QuickDic dictionaries accompanying the QuickDic app:
GitHub user Gitsaibot authored shell scripts for generating QuickDic dictionaries specifically with the Tolino in mind (the .jar file here is exactly the same as in the original project):
Since it is a Java application, it needs a JRE to run (portable version). It also requires the following libraries:
Commons Compress,
Commons Lang3,
International Components for Unicode,
Xerces-J Impl.
For convenience, I packaged everything necessary to run it in a Windows environment into a single archive, which I named QuickDicBuilder. Here's how to use it:
- Download and unpack: QuickDicBuilder.zip
- Edit QuickDicBuilder.cmd and set JAVA_EXE to point to the Java binary on your system.
- QuickDicBuilder can now be called just like any other command-line utility.
Note: Thad Hughes and Reimar Döffinger are the original authors; I am only redistributing this. For source code, please refer to the GitHub links above.
Creating Dictionaries: How to Use It
The dictionary generation tool is functional but not very well documented. Some extra information on how it is supposed to be used can be found in old, closed GitHub issues and in its source code.
The utility supports several input formats: "Wiktionary", "tab_separated", and "Chemnitz". The last of these follows the format of several German dictionaries available here. Tab-separated is the most straightforward format to use, and it is probably best illustrated by example.
Case #1: Dict.cc
Dict.cc dictionaries can be downloaded (for personal use) from:
https://www1.dict.cc/translation_file_request.php
I downloaded their Russian-English dictionary, and converted it to QuickDic format with the following command:
QuickDicBuilder --dictInfo="Dict.cc Russian-English" --dictOut="RU-EN_DictCC.quickdic" --input1="dictcc.ru-en.txt" --input1Charset=UTF8 --input1Format=tab_separated --input1Name="dictcc" --lang1="RU" --lang1Stoplist="StopLists\xx.txt" --lang2="EN"
I did not have a Russian stoplist, so I used an empty one. A stoplist contains frequently appearing words that should be dropped from the index, so a real Russian stoplist would probably give a better result.
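If no ready-made stoplist is at hand, one rough option is to derive it from the dictionary data itself by counting the most frequent words in the headword column. The Python sketch below does that. It assumes that a stoplist is a plain UTF-8 text file with one word per line (worth checking against the files in the StopLists folder) and that lines starting with "#" are comments. The function names build_stoplist and write_stoplist are hypothetical.

```python
import re
from collections import Counter


def build_stoplist(lines, column=0, top_n=50):
    """Return the top_n most frequent lowercase words in one tab-separated column."""
    counts = Counter()
    for line in lines:
        # Assumption: '#' starts a comment line; blank lines carry no data.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\r\n").split("\t")
        if column < len(fields):
            counts.update(re.findall(r"\w+", fields[column].lower()))
    return [word for word, _ in counts.most_common(top_n)]


def write_stoplist(words, path):
    """Write the words one per line (assumed stoplist layout)."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in words:
            f.write(word + "\n")
```

A frequency list built this way is only a starting point. It should be reviewed by hand, since words that are frequent in headwords are not always words that should be left out of the index.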
This conversion is relatively easy because the format of the downloaded file follows what the utility expects as its "tab_separated" input.
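Before running the builder on some other source, it may help to check that the file really looks tab-separated. The Python sketch below counts how many tab-separated fields each line has, so that lines without a tab stand out. It assumes that lines starting with "#" are comments, as in the header of the Dict.cc download. The function name field_count_histogram is made up for illustration and is not part of any tool mentioned here.

```python
from collections import Counter


def field_count_histogram(lines):
    """Map 'number of tab-separated fields' to 'number of lines with that many'."""
    hist = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Assumption: empty lines and '#' comment lines carry no entries.
        if not line or line.startswith("#"):
            continue
        hist[len(line.split("\t"))] += 1
    return hist
```

It can be called as field_count_histogram(open("dictcc.ru-en.txt", encoding="utf-8")). Any lines counted under 1 contain no tab at all and would need fixing before conversion.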
Case #2: CC-CEDICT
CC-CEDICT is a Chinese-English dictionary that can be downloaded from:
https://www.mdbg.net/chinese/dictionary?page=cc-cedict
Here, the conversion command was:
QuickDicBuilder --dictInfo="CC-CEDICT Chinese-English" --dictOut="CC-CEDICT.quickdic" --input1="cedict_ts.txt" --input1Charset=UTF8 --input1Format=tab_separated --input1Name="cc-cedict" --lang1="ZH" --lang1Stoplist="StopLists\xx.txt" --lang2="EN" --lang2Stoplist="StopLists\en.txt"
However, the input data first needed to be rearranged from:
TraditionalHeadword SimplifiedHeadword [Pronunciation] /Definition/
to:
TraditionalHeadword SimplifiedHeadword<Tab>Definition /Pronunciation/
For this purpose I used the following regular expression with sed:
sed -e "s/^ *\([^ ]*\) \([^ ]*\) *\[ *\(.*\) *\] *\/ *\(.*\) *\/.*$/\1 \2\t\4 \/\3\//g" cedict_ts.u8 > cedict_ts.txt
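For systems without sed, the same rearrangement can be sketched in Python. This is an alternative, not the method used above. It keeps the two headword forms in their original order, puts a tab after them, and moves the bracketed pronunciation behind the definitions. The names convert_line and convert_file are hypothetical, and the pattern is slightly stricter than the sed one: the pronunciation stops at the first closing bracket, so brackets inside the definitions are left alone.

```python
import re

# Two headword forms, [pronunciation], then /gloss 1/gloss 2/.../
CEDICT_LINE = re.compile(r"^\s*(\S+)\s+(\S+)\s+\[([^\]]*)\]\s*/(.*)/\s*$")


def convert_line(line):
    """Return the tab-separated form of one CC-CEDICT entry, or None."""
    if line.startswith("#"):  # header and comment lines
        return None
    match = CEDICT_LINE.match(line)
    if match is None:  # not an entry line
        return None
    first, second, pron, glosses = (g.strip() for g in match.groups())
    return f"{first} {second}\t{glosses} /{pron}/"


def convert_file(src, dst):
    """Convert every entry in src and write the results to dst."""
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8", newline="\n") as fout:
        for line in fin:
            converted = convert_line(line)
            if converted is not None:
                fout.write(converted + "\n")
```

Calling convert_file("cedict_ts.u8", "cedict_ts.txt") would produce the input file used in the command above. Unlike the sed command, which passes non-matching lines through unchanged, this sketch drops comment lines and anything else that does not look like an entry.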
Results
I did this quickly, just to check that it works, but if you want to, you can download the dictionary files I generated.