Old 03-05-2020, 05:38 PM   #1
geek1011
Wizard
Posts: 1,793
Karma: 4402832
Join Date: May 2016
Location: Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
dictutil: Tools, documentation, and libraries related to Kobo dictionaries

dictutil
Tools, documentation, and libraries related to Kobo dictionaries (and a few converted ones).
___



This project contains a collection of tools and libraries to work with Kobo dictionaries, plus comprehensive documentation of Kobo's dictionary format.

Unlike previous attempts at working with Kobo dictionaries, dictutil fully supports every feature nickel supports (word prefixes, Unicode, variants, images, etc.). It focuses on simplicity, correctness (prefix generation and other features are tested directly against libnickel's code and regexps, and v1/v2 dictionaries are differentiated), and completeness (most of the research was done by reverse-engineering libnickel).

In addition, it has a custom format for creating Kobo dictionaries which has a simple syntax and full support for all features.

Dictutil consists of multiple tools and libraries:
  • dictutil provides commands for installing, removing, unpacking, packing, and performing low-level modifications and tests on Kobo dictionaries. All operations are intended to be correct, lossless, and deterministic.
  • dictgen simplifies creating full-featured dictionaries for Kobo eReaders, with support for images, unicode prefixes, raw html, markdown, and more.
  • dicthtml documents Kobo's dictionary format and how it works.
  • examples/gotdict-convert is a working example of using dictutil to convert GOTDict into a Kobo dictionary.
  • examples/webster1913-convert is a working example of using dictutil to convert Project Gutenberg's Webster's Unabridged Dictionary into a Kobo dictionary.
  • examples/dictzip-decompile is an experimental tool to convert a dictzip into a dictfile.
  • examples/bgl-convert is a simple tool to convert Babylon BGL dictionaries to a dictfile.
  • Library: kobodict provides support for reading, writing, encrypting, and decrypting Kobo dictionaries.
  • Library: dictgen provides the functionality of dictgen as a library.
  • Library: marisa provides a simplified self-contained CGO wrapper for marisa-trie.
Dictutil implements version 2 of the Kobo dictionary format, which supports firmware versions 4.7.10364+.

See the website for more details and examples.

Quick reference:

dictgen:
Spoiler:
Code:
Usage: dictgen [options] dictfile...

Options:
  -o, --output string         The output filename (will be overwritten if it exists) (- is stdout) (default "dicthtml.zip")
  -c, --crypt string          Encrypt the dictzip using the specified encryption method (format: method:keyhex)
  -I, --image-method string   How to handle images (if an image path is relative, it is loaded from the current dir) (base64 - optimize and encode as base64, embed - add to dictzip, remove) (default "base64")
      --remove-footer         Add code to prevent the non-applicable dictionary source footer for certain locales from being added after the entry (e.g. if replacing the French dictionary)
  -h, --help                  Show this help text

If multiple dictfiles (*.df) are provided, they will be merged (duplicate entries are fine; they will be shown in sequential order). To read from stdin, use - as the filename.

Note that the only usable image method is currently removing them or using base64-encoding (for firmware 4.20.14601+; older versions segfault in the in-book dictionary), as embedded dict:/// image URLs cause the webviews to appear blank (this is a nickel bug). See https://github.com/geek1011/dictutil/issues/1 for more details.

See https://pgaskin.net/dictutil/dictgen/ for more information about the dictfile format.
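For anyone wondering what the base64 image method amounts to, here is a rough Python sketch of embedding an image directly into the entry HTML as a data URI. The function names are made up for illustration, and this skips the optimization step the help text mentions; it is only a sketch of the general idea, not dictgen's actual code.

```python
import base64


def image_to_data_uri(data: bytes, mime: str = "image/png") -> str:
    """Return a data: URI that embeds the raw image bytes as base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def img_tag(data: bytes, mime: str = "image/png") -> str:
    """Build an <img> tag whose src carries the image inline (hypothetical helper)."""
    return f'<img src="{image_to_data_uri(data, mime)}">'


if __name__ == "__main__":
    # Placeholder bytes only; a real image file would be read with open(path, "rb").
    print(img_tag(b"abc"))
```

Because the image travels inside the HTML itself, nothing has to be resolved through a dict:/// URL, which is the part that is described above as broken in nickel.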

dictgen dictfile format:
Spoiler:
Code:
- `@ HEADWORD`: Start a new entry. The headword doesn't have to be unique, and can contain spaces.
  - Header
    - `: WORD_INFO` or `::` *(optional)*: Add extra word info after the headword, or remove it entirely.
    - `& VARIANT` *(optional)*: Add an additional word to match. Follows the same rules as the headword. Can be repeated multiple times.
  - Body
    - `MARKDOWN` or `<html> RAW_HTML`: Include a definition written in Markdown or raw HTML code.
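As a rough illustration of the format above, this Python sketch writes one dictfile entry from its parts. The function name and its parameters are my own invention, and it only covers the `@`, `:` and `&` lines described above (no `::`, no `<html>` bodies, no escaping of body lines that happen to start with a marker), so treat it as a starting point rather than a complete generator.

```python
def format_entry(headword, body, info=None, variants=()):
    """Format a single dictfile entry (hypothetical helper).

    headword: the word to look up (may contain spaces)
    body:     the definition, written in Markdown
    info:     optional extra word info shown after the headword
    variants: optional additional words that should match this entry
    """
    lines = [f"@ {headword}"]
    if info is not None:
        lines.append(f": {info}")
    for variant in variants:
        lines.append(f"& {variant}")
    lines.append(body)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_entry("apple", "A round **fruit**.", info="n.", variants=["apples"]), end="")
```

Concatenating the output for several entries into a single .df file gives something that can be passed to dictgen.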

dictutil:
Spoiler:
Code:
Usage: dictutil command [options] [arguments]

Dictutil provides low-level utilities to manipulate Kobo dictionaries (v2).

Commands:
  install (I)          Install a dictzip file
  pack (p)             Pack a dictzip file
  prefix (x)           Calculate the prefix for a word
  uninstall (U)        Uninstall a dictzip file
  unpack (u)           Unpack a dictzip file
  help                 Show help for all commands

Options:
  -h, --help   Show this help text

dictutil install:
Spoiler:
Code:
Usage: dictutil install [options] dictzip

Options:
  -k, --kobo string      KOBOeReader path (default: automatically detected)
  -l, --locale string    Locale name to use (format: ALPHANUMERIC{2}; translation dictionaries are not supported) (default: detected from filename if in format dicthtml-**.zip)
  -n, --name string      Custom additional label for dictionary (ignored when replacing built-in dictionaries) (doesn't have any effect on 4.20.14601+)
  -b, --builtin string   How to handle built-in locales [replace = replace and prevent from syncing] [ignore = replace and leave syncing as-is] (default "replace")
  -h, --help             Show this help text

Note:
  If you are not replacing a built-in dictionary, the 'Enable searches on extra
  dictionaries patch' must be installed, or you will not be able to select
  your custom dictionary.
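To make the filename-based locale detection from the --locale help text more concrete, here is a small Python sketch. The regex and the function name are assumptions based only on the help text (dicthtml-**.zip with a two-character alphanumeric locale); dictutil's real check may be stricter or looser than this.

```python
import os
import re
from typing import Optional

# Assumed pattern: "dicthtml-" + exactly two alphanumeric characters + ".zip".
_LOCALE_RE = re.compile(r"^dicthtml-([A-Za-z0-9]{2})\.zip$")


def locale_from_filename(path: str) -> Optional[str]:
    """Return the two-character locale from a dictzip filename, or None."""
    match = _LOCALE_RE.match(os.path.basename(path))
    return match.group(1) if match else None


if __name__ == "__main__":
    for name in ("dicthtml-pl.zip", "dicthtml-en-pl.zip", "dicthtml.zip"):
        print(name, "->", locale_from_filename(name))
```

When the filename does not match, the locale would have to be given explicitly with -l/--locale.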

dictutil uninstall:
Spoiler:
Code:
Usage: dictutil uninstall [options] locale

Options:
  -k, --kobo string      KOBOeReader path (default: automatically detected)
  -b, --builtin string   How to handle built-in locales [normal = uninstall the same way as the UI] [delete = completely delete the entry (doesn't have any effect on 4.20.14601+)] [restore = download the original dictionary from Kobo again] (default "normal")
  -h, --help             Show this help text

dictutil pack:
Spoiler:
Code:
Usage: dictutil pack [options] dictdir

Options:
  -o, --output string   The output dictzip filename (will be overwritten if it exists) (default "dicthtml.zip")
  -c, --crypt string    Encrypt the dictzip using the specified encryption method (format: method:keyhex)
  -h, --help            Show this help text

dictutil unpack:
Spoiler:
Code:
Usage: dictutil unpack [options] dictzip

Options:
  -o, --output string   The output directory (must not exist) (default: the basename of the input without the extension)
  -c, --crypt string    Decrypt the dictzip (if needed) using the specified encryption method (format: method:keyhex)
  -h, --help            Show this help text
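The -c/--crypt options above take a value in the form method:keyhex. As a sketch of how such a value can be split apart, here is a short Python example. The function name is hypothetical, and "examplemethod" is only a placeholder (the help text does not list the valid method names), so this shows the shape of the argument and nothing more.

```python
def parse_crypt_spec(spec: str):
    """Split a 'method:keyhex' string into (method, key_bytes).

    Raises ValueError if the separator, the method, or the key is missing,
    or if the key is not valid hexadecimal.
    """
    method, sep, keyhex = spec.partition(":")
    if not sep or not method or not keyhex:
        raise ValueError(f"expected method:keyhex, got {spec!r}")
    return method, bytes.fromhex(keyhex)


if __name__ == "__main__":
    # "examplemethod" is a placeholder, not a real dictutil method name.
    print(parse_crypt_spec("examplemethod:00ff10"))
```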

dictutil prefix:
Spoiler:
Code:
Usage: dictutil prefix [options] word...

Options:
  -f, --format string   The output format (go-slice, go-map, csv, tsv, json-array, json-object) (default "json-array")
  -h, --help            Show this help text

gotdict-convert:
Spoiler:
Code:
Usage: gotdict-convert [options]

Options:
  -g, --gotdict string   The path to the local copy of github.com/wjdp/gotdict. (default "./gotdict")
  -o, --output string    The output filename (will be overwritten if it exists) (- is stdout) (default "./gotdict.df")
  -I, --images           Include images in dictfile
  -h, --help             Show this help text

To convert the resulting dictfile into a dictzip, use dictgen.

webster1913-convert:
Spoiler:
Code:
Usage: webster1913-convert [options] gutenberg_webster1913_path

Options:
  -o, --output string   The output filename (will be overwritten if it exists) (- is stdout) (default "./webster1913.df")
      --dump            Instead of converting, dump the parsed dictionary to stdout as JSON (for debugging)
  -h, --help            Show this help text

Arguments:
  gutenberg_webster1913_path is the path to Project Gutenberg's Webster's 1913 dictionary. Use - to read from stdin.

To convert the resulting dictfile into a dictzip, use dictgen.

dictzip-decompile:
Spoiler:
Code:
Usage: dictzip-decompile [options] dictzip

Options:
  -o, --output string   The output filename (will be overwritten if it exists) (- is stdout) (default "./decompiled.df")
  -r, --resources       Also extract referenced resources to the current directory (warning: any existing files will be overwritten, so it is recommended to run in an empty directory if enabled)
  -h, --help            Show this help text

Arguments:
  dictzip is the path to the dictzip to decompile.

To convert the resulting dictfile into a dictzip, use dictgen.

Note: The regenerated dictzip from the dictfile may not match exactly, but it will look the same, and certain bugs with prefixes and variants will be implicitly fixed by the conversion process (i.e. variant in wrong file, incorrect prefix, missing words in index file). All output is in raw HTML, not Markdown.

This is an experimental tool, and the output may not be perfect on complex dictionaries.

Download | Website

Last edited by geek1011; 05-05-2020 at 12:12 PM. Reason: v0.3.0
Old 03-05-2020, 05:38 PM   #2
geek1011
Wizard
Dictionaries

This post contains pre-built dictionaries converted using the tools in examples/*-convert. Links to the source dictfiles are also included for use as examples, for merging with your own dictionaries, or for editing.

Webster's 1913 Unabridged Dictionary (from gutenberg.org/ebooks/29765)
GOTDict (from github.com/wjdp/gotdict)
Generated by dictutil v0.3.0.

Last edited by geek1011; 05-05-2020 at 12:17 PM. Reason: v0.3.0
Old 03-05-2020, 05:39 PM   #3
geek1011
Wizard
Changelogs

v0.3.0:
Spoiler:
6c333b4 all: Switched windows builds to 64-bit
e442434 docs: Added links to pre-built dictionaries to the main header
50aa051 docs: Clarified the current situation with read-only dictionaries (#6)
939235f cmd/dictgen: Added option to prevent the footer from showing on certain locales (fixes #5)
8ef75c2 dictgen: Added PostRawHTML option to DictFileEntry
10e0680 all: Added integration tests for overall dictzip-related functionality
1a820d5 examples/webster1913-convert: Fixed false positives for headwords
dae2dd1 marisa: Fixed reading large tries, added debug env var
5bfa4c5 marisa: Finished with the last bit of cleanup
0239665 all: Make marisa optional for building kobodict
fdc6c66 marisa: Fixed a few subtle errors and simplified the stream shim
018a9ae docs: Updated links
a84722e docs: Added documentation for examples/bgl-convert
f490dfc examples/bgl-convert: Added new tool to convert Babylon BGL dictionaries
1a80ab6 marisa: Cleaned up everything
0b4392a marisa: Cleaned up imports
65ab5e6 all: Added Travis builds for testing on macOS
bbb361d marisa: Added support for directly using io.Reader/io.Writer
3930b08 all: Removed SWIG from AppVeyor builds
338b88a marisa: Fixed tests on non-amd64 architectures
c91db49 marisa: Rewrote wrapper to run completely in-memory and not depend on SWIG
3c56b2e kobodict: Abstracted marisa-trie bindings into an interface
61364e6 all: Checked built-in dictionaries against 4.20.14622
bca4c53 examples/webster1913-convert: Handle synonyms starting after the example of a numbered definition
0b8179b cmd/dictutil: Fixed missing newline in fmt.Printf

v0.2.1:
Spoiler:
ec6e0d7 all: Don't upload converted dictionaries from Drone
6913bda examples/webster1913-convert: Cleaned up leading and trailing whitespace in parsed dictionary
b6f76da cmd/dictutil: Added warning about dictionary label not having an effect on 4.20.14601+
96b86db all: Updated dictionary installation for 4.20.14601

v0.2.0:
Spoiler:
8a0e9ee all: Updated release build script
ccfb5f6 examples/dictzip-decompile: Implemented a tool to convert a dictzip into a dictfile
1e207d6 dictgen: Increased maximum line buffer size

v0.1.2:
Spoiler:
7e56ba0 cmd/dictutil: Fixed file permissions on Windows

v0.1.1:
Spoiler:
a761abd all: Fixed AppVeyor builds
652765b marisa: Removed pre-generated SWIG wrapper for now (fixes #2)
78994bb docs: Added MobileRead link to nav
0881c03 docs: Fixed link on homepage
c9e9847 examples/webster1913-convert: Fixed phrase definitions

v0.1.0:
Spoiler:
6024447 all: Updated dependencies
025a074 all: Fixed AppVeyor artifact download links
8dca7a7 all: Use AppVeyor to build pre-built dictionaries for examples/*
54d1ad1 examples/webster1913-convert: Separated parser into its own package
1aedd40 examples/gotdict-convert: Separated parser into its own package
e49fcfd all: Added package and command godoc comments
53ccb4e examples/webster1913-convert: Also add phrase definitions as variants so they can be looked up
555d4f1 examples/webster1913-convert: Added new tool for parsing and converting Gutenberg's Webster's 1913 dictionary
2e5270c docs: Improved nav order
1635ed0 docs: Finshed documentation for dictutil commands
f08f72b docs: Added getting started section to main page
164094b docs: Improved dicthtml format examples
382b790 docs: Finished dictgen and dictfile documentation
45c48a8 cmd/dictgen: Improved wording of help text
ad77e8c docs: Added note about custom dictionary labels on 4.20.14601
e984c78 cmd/dictutil: Added note about dictionary labels on 4.20.14601
712d5fa docs: Finished prefix documentation
6492cbc dictgen: Updated notes about image methods (#1)
10eadad docs: Updated gotdict-convert docs
da6d4ad docs: Updated dictgen docs
01da35f examples/gotdict-convert: Prefer the version with images (#1)
d05535b cmd/dictgen: Enable images with base64 encoding by default (#1)
512c333 docs: Updated for 4.20.14601
2a80ab3 all: Updated Go to 1.14
5931c7f examples/gotdict-convert: Made version string consistent with other dictutil commands
da99419 examples/gotdict-convert: Updated gotdict, removed workarounds
8de70b8 kobodict: Further simplified and improved performance of WordPrefix
bc52ded marisa: Removed compile-time dependency on SWIG
7cdb1d8 kobodict: Fixed memory leak from marisa
7276923 examples/gotdict-convert: Fixed build
3533800 all: Updated build dependency graph
641efd5 all: Disabled image support by default (#1)
b403da2 dictgen: Added Description func to ImageHandler
aa108b2 cmd/dictgen: Show more information about options
66d311f dictgen: Extracted ImageFunc from WriteDictZip for extensibility
b3acaf0 all: Added README, updated links in help text
2ccdbb6 dictgen: Added some doc comments
1aeb44b dictgen: Added support for image CSS, fixed images displaying too small
b56c1b8 examples/gotdict-convert: Implemented image support
a739fca cmd/dictgen: Implemented image support
46a4fd3 dictgen: Implemented image support
744fca8 kobodict: Added Exists function on Writer
c7a1fc8 cmd/dictutil: Fixed locale removal when ExtraLocales contains spaces
3d3008f docs: Misc fixes
54ab4a1 docs: Added stubs for all pages, wrote some more docs
ce7bc60 docs: Finished format docs
e1e4b25 cmd/dictutil: Implemented prefix command
ccc6c72 examples/gotdict-convert: Improved Drone config for uploads
2f4c068 kobodict: Improved parameter name
55fa1c6 all: Added version info to commands
dea5f2e all: Cleaned up old files, updated some TODOs
0ac4192 cmd/dictutil: Implemented uninstall command
0e4ba01 cmd/dictutil: Made help text more complete in Drone config
f2e29b8 cmd/dictutil: Implemented install command
c24c2fb marisa: Silenced the -Wimplicit-fallthrough warning
70ace33 docs: Added alternative dictionary installation method
d5bd3b7 all: Added cross-compilation and releases to Drone config
25e0621 cmd/dictutil: Changed return code to 0 for help text
6edaf98 docs: Fixed site URL
79969ed docs: Changed Jekyll theme to just-the-docs
932448c docs: Added info about installing dictionaries
fcb478e cmd/dictutil: Implemented dictutil command
99aedd6 dictgen: Added test to ensure the reparsing the generated dictfile is identical to the original
95ceef9 examples/gotdict-convert: Also generate and upload dictzip in Drone config
0551eb2 dictgen: Implemented Validate
2865b7c kobodict: Implemented Pack and Unpack
5539d29 all: Updated TODOs
eaf7d0c kobodict: Added stubs for Pack/Unpack
0dbd59c kobodict: Implemented Reader
7f22617 kobodict: Fixed build
384d955 cmd/dictgen: Implemented encryption
e89d5b5 dictgen: Improved API by allowing access to the kobodict.Writer
514a77c kobodict: Implemented Crypter, fixed bugs in Writer
ccf7f38 all: Updated Drone config
2927cb3 cmd/dictgen: Implemented initial version of the dictgen CLI
fccdcbb dictgen: Implemented dictzip generation
2d3b54c kobodict: Implemented kobodict.Writer
abda141 examples/gotdict-convert: Fixed definition image cleanup
0cc2589 cmd/dictutil: Reorganized files
2462a54 examples/gotdict-convert: Misc CLI improvements
3befc43 examples/gotdict-convert: Fixed CLI arguments
b54757e all: Added license
461c408 dictutil: Updated main func
ac9f819 docs: Added dictword-test and marisa to index
958390a all: Made Drone config more deterministic
33ff613 docs: Updated index
60664a6 examples/gotdict-convert: Implemented tool to convert github.com/wjdp/gotdict
91a6e55 marisa: Included merged libmarisa sources instead of depending on the lib.
81dbf10 all: Disable cross-compiling marisa for macOS for now
9b653e9 all: Cross-compile marisa
ce0107c all: Fixed Drone config
3ea1df4 marisa: Added Go wrapper
725206f all: Simplified Drone config
226f40a dictgen: Implemented parser and dicthtml generation
ec22c1a kobodict: Cleaned up outdated TODOs
c12cbf1 all: Ignore GH Pages build in Drone config
8512fa8 kobodict: Finished implementing reverse-engineered word prefix and normalization stuff
d5297b8 docs: Added note about kobo-mods/dictword-test
3758fcf docs: Added some notes about v1/v2 dictionaries and stubs for other docs
5a5a443 docs: Initial commit

Last edited by geek1011; 05-05-2020 at 12:14 PM. Reason: v0.3.0
Old 03-05-2020, 05:40 PM   #4
geek1011
Wizard
reserved

Last edited by geek1011; 03-07-2020 at 08:42 AM.
Old 03-06-2020, 12:36 AM   #5
geek1011
Wizard
I've released v0.1.1 with some small fixes to the converted Webster's dictionary, and a fix for segfaults from marisa on Windows, and v0.1.2 with a fix for file permissions on Windows. Thanks to Semwize for reporting the issues on Windows!

Last edited by geek1011; 03-06-2020 at 01:53 AM.
Old 03-06-2020, 02:03 AM   #6
Semwize
 
Posts: 412
Karma: 121282
Join Date: Jun 2016
Device: Kobo
Works

It would be great to add a stardict (XML) -> df converter,
and it would be very good to have the ability to convert dicthtml -> df
Old 03-06-2020, 04:14 PM   #7
Cyfranek
Connoisseur
Posts: 74
Karma: 12328
Join Date: Jul 2017
Location: Poland
Device: PocketBook Touch HD 3, Kindle Oasis 3, Tolino Vision 3 HD
@geek1011 please tell me:
- can I add my custom dictionary (Polish) using dictutil?
dictutil install dicthtml_pl.zip

- can I add my custom translation dictionary (English-Polish) using dictutil?
dictutil install dicthtml_en-pl.zip

Polish is not supported by Kobo, so I don't know if it's possible with your tool. If not, can I replace another dictionary (Portuguese, for instance) with the Polish one and prevent the original from syncing? How?
TIA
Old 03-06-2020, 04:41 PM   #8
Semwize
 
Quote:
Originally Posted by Cyfranek View Post
- can I add my custom dictionary (Polish) using dictutil?
dictutil install dicthtml_pl.zip
Yes, the program will write to the SQLite database and place the dictionary in the correct Kobo folder.
Quote:
Originally Posted by Cyfranek View Post
- can I add my custom translation dictionary (English-Polish) using dictutil?
dictutil install dicthtml_en-pl.zip
No. Rename it first, for example to dicthtml_e1.zip
Old 03-06-2020, 05:16 PM   #9
geek1011
Wizard
Semwize's answers are correct.

Quote:
Originally Posted by Cyfranek View Post
Polish is not supported by Kobo, so I don't know if it's possible with your tool. If not, can I replace another dictionary (Portuguese, for instance) with the Polish one and prevent the original from syncing? How?
You can replace any built-in locale, and dictutil will automatically prevent syncing the original one.

__
P.S. Semwize: Thanks for all the help with testing dictutil (I'd give you karma, but I have to spread it around first)!
Old 03-06-2020, 05:21 PM   #10
geek1011
Wizard
Quote:
Originally Posted by Semwize View Post
It would be great to add a stardict (XML) -> df converter
I've decided against that one, as dictutil is not intended to be a general-purpose dictionary tool (that's what Penelope is for), and because the stardict format is too broad (there are 4+ different body text formats I would have to implement from scratch).

Quote:
and it would be very good to have the ability to convert dicthtml -> df
I'm working on it right now, and am nearly finished.

It will support seamless decompilation of dictionaries generated by Penelope, Kobo (the unencrypted ones, if any), or dictgen (although it won't be able to recover the original Markdown or images, only the raw HTML). Dictionaries generated by other tools or created manually can also be decompiled, but it won't be able to automatically extract the header and generate the best dictfile.

It will also be useful for fixing bugs with prefixes, variants, and missing words in existing dictionaries (just decompile and regenerate the dictfile). Another use will be merging dictionaries (just decompile them and merge the resulting dictfiles).

You will also be able to use it to convert stardict dictionaries by using Penelope to convert it to a dictzip, then this tool to decompile it to a dictfile, and then modify it from there.
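The decompile-and-merge workflow described here can be sketched in Python as a list of commands to run. It relies only on the usage shown in the first post (dictzip-decompile -o, and dictgen -o with several dictfiles); the function name and the intermediate filenames are made up for the example, and nothing is executed by the function itself.

```python
def merge_commands(dictzips, output="merged.zip"):
    """Build the command lines to decompile several dictzips and merge them.

    Returns a list of argv lists: one dictzip-decompile call per input,
    followed by a single dictgen call over all resulting dictfiles.
    """
    commands = []
    dictfiles = []
    for index, dictzip in enumerate(dictzips):
        dictfile = f"decompiled-{index}.df"  # hypothetical intermediate name
        commands.append(["dictzip-decompile", "-o", dictfile, dictzip])
        dictfiles.append(dictfile)
    commands.append(["dictgen", "-o", output, *dictfiles])
    return commands


if __name__ == "__main__":
    for argv in merge_commands(["a.zip", "b.zip"], "out.zip"):
        # To actually run a command: subprocess.run(argv, check=True)
        print(" ".join(argv))
```

Regenerating a single dictionary to fix prefix or variant bugs is the same thing with one input.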

Last edited by geek1011; 03-06-2020 at 05:24 PM.
Old 03-06-2020, 06:23 PM   #11
JSWolf
Resident Curmudgeon
Posts: 57,151
Karma: 52070132
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Aura H2O, PRS-650, PRS-T1, nook STR, iPad 4, iPhone SE 2020, PW3
Will dictutil allow converting a Kindle format dictionary to a Kobo format dictionary?
Old 03-06-2020, 06:55 PM   #12
geek1011
Wizard
Quote:
Originally Posted by JSWolf View Post
Will dictutil allow converting a Kindle format dictionary to a Kobo format dictionary?
No, as that would require implementing a MOBI parser and a whole lot more code. Besides, that's starting to get outside the scope of dictutil. You can use another tool to convert it to a usable format, then from there to a dictfile, though.
Old 03-06-2020, 09:12 PM   #13
geek1011
Wizard
Quote:
Originally Posted by Semwize View Post
and it would be very good to have the ability to convert dicthtml -> df
I've released v0.2.0 with a new dictzip-decompile tool. See the documentation for more information.

In addition, I've increased the maximum line buffer size for dictgen, which fixes compiling some large dictfiles with long lines of generated HTML.
Old 03-07-2020, 04:48 AM   #14
Semwize
 
One thing I noticed: when I created extra dictionaries with Penelope, phrases ('word word', 'word word word') were not looked up properly (they were found, but no definition was shown).

I created the dictionary using dictutil, and now everything is fine.
Quote:
Originally Posted by geek1011 View Post
I've decided against that one
Yes, it is essentially superfluous.

Last edited by Semwize; 03-07-2020 at 05:13 AM.
Old 03-07-2020, 08:40 AM   #15
geek1011
Wizard
Quote:
Originally Posted by Semwize View Post
One thing I noticed: when I created extra dictionaries with Penelope, phrases ('word word', 'word word word') were not looked up properly (they were found, but no definition was shown).

I created the dictionary using dictutil, and now everything is fine.
For the record, you will see the same thing for words with accented characters. Under certain conditions, you'll also see it for words starting with Cyrillic characters or containing non-ASCII characters after the first two non-whitespace ones. In addition, although this is more of a theoretical case, you'll see it for words starting with whitespace.

See here and here.
Tags
dictgen, dicthtml, dictionaries, dictionary, dictutil
