02-01-2020, 01:23 PM | #31 | |
Connoisseur
Posts: 76
Karma: 10742
Join Date: Jul 2017
Location: Serbia
Device: Kobo Aura One
|
Quote:
I use a free bit of kit called GoldenDict for my dictionaries, which accepts nearly every format known to man. The versatility means that there's a terrific community of modders/rippers/compilers, and I've amassed quite a hoard of dictionaries. If I had to guess, I'd say the source was a .dsl (a semi-proprietary format of the ABBYY Lingvo dictionary software) of the 6th edition of the SOED. I've always presumed that this version came from a rip of the CD-ROM data. I'm still using that dsl in GoldenDict on my desktop and Android. Comparing the results of identical searches in Kobo and GoldenDict gives identical results, so it would make sense. I do remember having to double convert some dictionaries, once with some ancient tool that converted what I had to a format that Penelope (was it even called that back then?) of that time could read, and once with Penelope itself to get it to work with Kobo. Perhaps that's what I did with the SOED? Regardless, it converted remarkably well. The file even retained style rules, layout and line breaks, which not all of my conversions did. Drop me a line if you'd like to tinker with it. |
|
02-01-2020, 01:23 PM | #32 |
Grand Sorcerer
Posts: 6,216
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
@geek1011,
I copied the whole directory to my Win10 C:\Program Files directory. The following CMD command
Code:
"C:\Program Files\marisa64\usr\local\bin\marisa-build.exe" -o words2 index_coe.txt
gave this error:
Code:
"The code execution cannot proceed because libgcc_s_sjlj-1.dll was not found. Reinstalling the program may fix this problem."
In case this is relevant, I believe the much older Marisa Windows executables posted here on MR were 32-bit. They were all much smaller than these new ones and they were all standalone .exe's which could be copied to wherever was most convenient at the time. I say "all", but I think I've only actually used marisa-build.exe.
ETA: This file was also flagged as missing: libstdc++-6.dll
Last edited by jackie_w; 02-01-2020 at 01:54 PM. Reason: ETA |
02-01-2020, 01:41 PM | #33 |
Guru
Posts: 880
Karma: 270656
Join Date: Jun 2016
Device: Kobo
|
|
02-01-2020, 01:49 PM | #34 |
Wizard
Posts: 2,788
Karma: 6990707
Join Date: May 2016
Location: Ontario, Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
|
Marisa tools for Windows
Try this one. I've built it statically (./configure --enable-shared=no --enable-static=yes --host=i686-w64-mingw32 LDFLAGS="-static -static-libgcc -static-libstdc++"). The other one worked for me when testing in Wine, but I didn't test it on Windows. I've tested this one in an actual Windows VM, so it should work fine (the binaries are standalone). P.S. The reason these are so much larger is that I'm cross-compiling C++ with MinGW rather than MSVC. Last edited by geek1011; 02-02-2020 at 11:03 PM. Reason: added header |
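The DLL errors reported in post #32 are what a dynamically linked MinGW build produces when its runtime libraries are not found; Windows looks in the executable's own folder first. As a rough sketch only, the Python below checks whether the two DLLs named in those error messages sit next to the executable. The helper name `missing_dlls` is made up for illustration, and the check covers only the application folder, not the rest of the Windows search path.

```python
from pathlib import Path

# DLL names taken from the error messages quoted in post #32.
REQUIRED_DLLS = ["libgcc_s_sjlj-1.dll", "libstdc++-6.dll"]


def missing_dlls(exe_dir, required=REQUIRED_DLLS):
    """Return the names in `required` that are not present in `exe_dir`.

    Hypothetical helper: it only inspects the executable's own folder,
    not System32, PATH, or any other location Windows may search.
    """
    folder = Path(exe_dir)
    # Compare case-insensitively, because Windows file names are.
    present = {p.name.lower() for p in folder.iterdir() if p.is_file()}
    return [name for name in required if name.lower() not in present]
```

For example, `missing_dlls(r"C:\Program Files\marisa64\usr\local\bin")` would list whichever of the two files is absent from that folder. With the static build above, the list no longer matters, since those libraries are linked into the .exe.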
02-01-2020, 02:21 PM | #35 | |
Grand Sorcerer
Posts: 6,216
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
Now to experiment with marisa-dump.exe ... |
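For anyone who wants to script the build step, here is a minimal Python sketch. It writes a word list with one key per line (which, as far as I know, is the input marisa-build expects) and assembles the same command line used in post #32. Only the `-o` option and the positional input file come from that command; the helper names and the location of `marisa-build.exe` are placeholders.

```python
import subprocess


def write_word_list(words, path):
    """Write unique, sorted, non-empty words to `path`, one per line."""
    unique = sorted({w.strip() for w in words if w.strip()})
    # newline="\n" keeps Windows from writing "\r\n", so that no stray
    # carriage return can end up as part of a key.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.writelines(w + "\n" for w in unique)
    return unique


def marisa_build_command(exe, output, index_file):
    """Build the argument list: marisa-build -o OUTPUT INDEX_FILE."""
    return [str(exe), "-o", str(output), str(index_file)]


def build_trie(exe, output, index_file):
    """Run marisa-build; raises CalledProcessError on a non-zero exit."""
    subprocess.run(marisa_build_command(exe, output, index_file), check=True)
```

Passing the arguments as a list avoids the quoting needed for paths containing spaces, such as C:\Program Files.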
|
02-01-2020, 03:44 PM | #36 |
Evangelist
Posts: 495
Karma: 356531
Join Date: Jul 2016
Location: 'burta, Canada
Device: Kobo Glo HD
|
Just generally commenting on how pyglossary is a wonderful tool, especially when teamed up with penelope for Kobos. And it seems like all the cool kids are using mdict these days (cough)...
|
02-02-2020, 04:25 PM | #37 |
Wizard
Posts: 2,788
Karma: 6990707
Join Date: May 2016
Location: Ontario, Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
|
Kobo prefix logic
Here's all the logic used to generate the prefixes. I've tested it against libnickel, and it's also based on the disassembly of DictionaryParser::htmlForWord (it was slightly annoying that most of the Qt stuff was inlined). I've simplified it and improved the performance, and also left in the original code for reference. I'll write some proper documentation and make a thread when I finish dictutil later. Here is the code: https://sourcegraph.com/github.com/g...util.go#L30-86
v1/v2 dictionary stuff
And here are some useful notes about v1/v2 dictionaries (this hasn't ever been discussed, or even noticed before AFAIK): https://pgaskin.net/dictutil/dicthtml/v1v2
Last edited by geek1011; 02-02-2020 at 11:03 PM. Reason: added header |
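The linked Go code is the authoritative version. Purely to illustrate what a "prefix" is in this context, here is a simplified Python sketch. Every rule in it (trim, lowercase, keep the first two characters, pad short words with `a`, fall back to `11` when a character is not a letter) is an assumption made for illustration and is not copied from libnickel or dictutil. The real logic has special cases that this sketch leaves out.

```python
def word_prefix_sketch(word):
    """Illustrative only: NOT the real Kobo/dictutil prefix rules."""
    w = word.strip().lower()
    if not w:
        return "11"  # assumption: empty words go to a catch-all bucket
    # Assumption: the prefix is the first two characters, and shorter
    # words are padded with "a" to reach two characters.
    prefix = w[:2].ljust(2, "a")
    # Assumption: anything that is not a letter goes to the "11" bucket.
    if not all(ch.isalpha() for ch in prefix):
        return "11"
    return prefix
```

The point is only that each headword maps to a short bucket name, so a lookup has to open a single small file rather than scan the whole dictionary.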
02-02-2020, 06:16 PM | #38 | |
Evangelist
Posts: 495
Karma: 356531
Join Date: Jul 2016
Location: 'burta, Canada
Device: Kobo Glo HD
|
Quote:
Looking forward to seeing your documentation on the dictionary format and the definition of the various tags. Am wondering if there are any obscure tags that I have yet to encounter.
I'm curious about Kanji: Will dictutil be able to handle those properly? Just wondering since the code says it's a special case. One of my main interests is in making bilingual Japanese word lists of, say my Anki flash card deck or one of my Japanese textbooks (in say csv or TAB file format) to help simplify definitions to my reading level and to keep the definitions consistent to what I'm learning and may be tested on later, but Japanese in particular has given me the most problems with dictionaries made with Penelope sometimes working and sometimes not. I suppose I could write and tag and sort into various files my own version manually, but I'd like to avoid that, if possible.
And I believe that tshering discovered that kanji look up only really works properly when using one of the built in Japanese language dictionaries (either jaaxdis, en-ja or en-ja-pgs), especially if you're not using the Japanese locale (in order to bring up the Japanese keyboard, I guess; not sure how it works with Chinese now that it is a supported language but word highlight/look up still seems fine regardless of OS language) because it may use a different function compared to the other languages. Is that still the case, and if so, is it possible (maybe through a patch?) to make Kanji lookup work regardless of the dictionary selected (for example, in order to have more than 3 different Japanese-related dictionaries installed)?
At the very least, I want to create a ja-en dictionary, and while I'm using norbusan's utility to enhance the built in jaxxdis dictionary, I really would love to create an updated one based on JMDict or this random Kenkyuusha one that somehow made its way into my possession (cough).
Last edited by rtiangha; 02-02-2020 at 06:38 PM. |
|
02-02-2020, 10:58 PM | #39 | |
Wizard
Posts: 2,788
Karma: 6990707
Join Date: May 2016
Location: Ontario, Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
|
Quote:
Cross-compiling libmarisa
I've finally got marisa to cross-compile properly on all platforms for dictutil: https://github.com/geek1011/dictutil...8d1cb469d4da8a. I can't believe it actually worked though, as I did it by writing a tool to merge all the C* sources into a single file and resolving includes to allow it to be easily compiled by CGO.
Testing Kobo prefix generation
I've also written dictword-test (https://pgaskin.net/kobo-mods/dictword-test/), which allows you to test libnickel's prefix generation directly. This tool is completely self-contained and doesn't conflict with or require patching libnickel. Binaries are available here: https://ci.appveyor.com/project/geek...uild/artifacts.
Last edited by geek1011; 02-02-2020 at 11:02 PM. Reason: dictword-test |
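The single-file approach described above is often called an amalgamation. As a rough illustration of the idea (not geek1011's actual tool, which is in the linked commit), a Python sketch might inline each local `#include "..."` the first time it is seen and leave system `#include <...>` lines alone. The function name and the include-resolution rule used here are assumptions.

```python
import re
from pathlib import Path

# Matches local includes such as: #include "marisa/trie.h"
LOCAL_INCLUDE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')


def amalgamate(path, root, seen=None):
    """Return the text of `path` with local includes inlined once each.

    Assumption: a quoted include resolves relative to the including
    file first, then relative to `root`; real build setups may differ.
    """
    seen = set() if seen is None else seen
    path = Path(path).resolve()
    if path in seen:  # already emitted once, so emit nothing this time
        return ""
    seen.add(path)
    out = []
    for line in path.read_text(encoding="utf-8").splitlines():
        m = LOCAL_INCLUDE.match(line)
        if not m:
            out.append(line)  # ordinary code or a system include
            continue
        for base in (path.parent, Path(root)):
            candidate = (base / m.group(1)).resolve()
            if candidate.is_file():
                out.append(amalgamate(candidate, root, seen))
                break
        else:
            out.append(line)  # unresolved: keep the directive as-is
    return "\n".join(out)
```

Tracking the files already emitted plays the role of include guards. Merging several source files this way can still run into clashing file-local names, which this sketch does not attempt to handle.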
|
02-03-2020, 08:35 PM | #40 | |
Enthusiast
Posts: 30
Karma: 10
Join Date: Jan 2020
Device: Kobo Libra H2O
|
Quote:
Thank you geek1011 and rtiangha! (And everyone else who helped me get to this point) (BTW rtiangha, I love the dictionary+thesaurus, especially the formatting! So far the word look-ups are accurate, with the one exception of the word 'invalided', but the WordNet 2 file I found here has the definition, so I'm happy.) |
|
03-01-2020, 05:59 AM | #42 | |
Member
Posts: 12
Karma: 10
Join Date: Oct 2019
Device: Onyx Boox Max Lumi 2, Kobo Libra H2O, Kindle Oasis 3
|
Quote:
|
|
03-02-2020, 04:01 PM | #43 |
Resident Curmudgeon
Posts: 75,889
Karma: 134368292
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
How easy is it to convert a Kindle dictionary to a Kobo dictionary?
|
03-02-2020, 04:36 PM | #44 | |
Guru
Posts: 880
Karma: 270656
Join Date: Jun 2016
Device: Kobo
|
Quote:
stardict ⇒ kobo (Penelope v3.1.3) |
|
Tags |
dictionary, kobo |
|