#121 |
Wizard
Posts: 1,178
Karma: 2431850
Join Date: Sep 2008
Device: IPad Mini 2 Retina
|
It is certainly technically possible to convert a commercial dictionary to Kobo format, assuming you already own the commercial version and it is for personal use only; even then it may not be legal, but I am not a lawyer so you need to make your own judgement about that. Morally you could argue that you only bought the commercial version so you could format shift it to Kobo format, so it is a win for the publisher - they have sold a copy which they would not have done otherwise. Of course, the resulting Kobo dictionary must not be distributed to anyone else, which would definitely be morally and legally wrong.
|
#122 | |
Addict
Posts: 311
Karma: 547600
Join Date: Jul 2010
Location: Paris
Device: Kindle Keyboard, Kindle NT, PRS-650
|
Quote:
|
|
#123 |
Wizard
Posts: 1,178
Karma: 2431850
Join Date: Sep 2008
Device: IPad Mini 2 Retina
|
Have you seen the instructions attached to the first post in this thread? That gives the steps you need to create a Kobo dictionary, all of which can be automated. There are three main steps:
1. Converting the source dictionary into the html format used by the Kobo.
2. Creating an index from the html.
3. Packaging the html and index into the Kobo dictionary.
Steps 2 & 3 are generic and are good candidates for a generic program; in fact I have code that does this already. I'm thinking about tidying it up and making it generally available; I don't know if there is much demand for it. It is Windows only.
Step 1 is the most time consuming and is bespoke to each source dictionary, as they all have their own formats. Again, for this step I write a program, but it is specific to each source dictionary. Depending on the source, it can be done manually using a text editor and, e.g., regular expressions, or perhaps XSLT transformations if you have the skill.
I think AlPe is also working on a converter program; see here. |
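As a rough illustration of the regular-expression route for step 1, here is a minimal Python 3 sketch. It assumes a source dictionary with one entry per line in the form "headword<TAB>definition", which is only one of many possible source formats. The names `ENTRY_RE`, `ENTRY_TEMPLATE`, `convert_line` and `convert_lines` are made up for this example, and the markup in `ENTRY_TEMPLATE` is a placeholder, not the real Kobo markup; take the actual markup from the instructions attached to the first post.

```python
import html
import re

# Assumed source layout: one entry per line, "headword<TAB>definition".
ENTRY_RE = re.compile(r"^(?P<word>[^\t]+)\t(?P<definition>.+)$")

# Placeholder markup only (an assumption for this sketch); replace it with
# the markup described in the instructions attached to the first post.
ENTRY_TEMPLATE = '<div class="entry"><b>{word}</b> {definition}</div>'


def convert_line(line):
    """Return an HTML fragment for one source line, or None if it does not match."""
    match = ENTRY_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return ENTRY_TEMPLATE.format(
        word=html.escape(match.group("word").strip()),
        definition=html.escape(match.group("definition").strip()),
    )


def convert_lines(lines):
    """Convert every matching line, skipping blank and malformed lines."""
    fragments = []
    for line in lines:
        fragment = convert_line(line)
        if fragment is not None:
            fragments.append(fragment)
    return fragments


if __name__ == "__main__":
    sample = ["argon\ta noble gas", "", "yoga\ta discipline <of body & mind>"]
    for fragment in convert_lines(sample):
        print(fragment)
```

Escaping with `html.escape` keeps stray `<`, `>` and `&` in definitions from breaking the generated html; a real source format will usually need a more elaborate pattern than this one.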
#124 |
Addict
Posts: 311
Karma: 547600
Join Date: Jul 2010
Location: Paris
Device: Kindle Keyboard, Kindle NT, PRS-650
|
Yep, thanks. I've read your thread plus this one, and thanks to everyone's work here it's pretty clear what has to be done. What I don't know is what the Mobipocket dictionary format looks like, but I think that doesn't belong in this thread. I'll do some research now...
|
#125 | ||
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
Quote:
Quote:
If you did not find anything that worked for you, you can try converting it to epub with calibre. This will give you one or several (x)html files. However, I have not tried it myself. |
||
#126 | ||
Addict
Posts: 311
Karma: 547600
Join Date: Jul 2010
Location: Paris
Device: Kindle Keyboard, Kindle NT, PRS-650
|
Thanks for your answer tshering.
Quote:
Quote:
|
||
#127 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
This is strange. On my Touch, I really have to try hard if I want l' to be selected together with the following word.
|
#128 | |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
Quote:
I really wanted to spend some time on it last weekend, but my spare time was consumed discussing interesting stuff about (our) Audio-eBooks with DAISY. I will try to write the output function by the end of this week. Last edited by AlPe; 12-03-2012 at 08:47 AM. |
|
#129 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
Did you try this with several books? Maybe your book uses a character for the apostrophe that the reader does not recognize as a delimiter.
|
#130 |
Addict
Posts: 311
Karma: 547600
Join Date: Jul 2010
Location: Paris
Device: Kindle Keyboard, Kindle NT, PRS-650
|
I tried yesterday: out of 4 books, one reacted as expected (' is a delimiter), while the 3 others had an apostrophe that is not treated as a delimiter. Weird. I guess it's not exactly the same character; I should check what's inside the file. But I also realized you can adjust the selection to whatever character you want, so it's fine, just a bit more annoying, and I won't try to deal with that at the dictionary level.
|
#131 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
In many cases, eBook typographers use a typographical apostrophe (U+2019), which renders better than the typewriter apostrophe (U+0027).
See: http://en.wikipedia.org/wiki/Apostrophe#Unicode |
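To check which of the two characters a given book actually uses, something along these lines might help. It is a small Python 3 sketch; the function names `count_apostrophes` and `normalize_apostrophes` are invented for this example, and it assumes you have already extracted the book text into a string.

```python
from collections import Counter

TYPEWRITER_APOSTROPHE = "\u0027"   # ' (typewriter apostrophe)
TYPOGRAPHIC_APOSTROPHE = "\u2019"  # typographical apostrophe


def count_apostrophes(text):
    """Count how often each of the two apostrophe characters occurs in text."""
    counts = Counter(
        ch for ch in text
        if ch in (TYPEWRITER_APOSTROPHE, TYPOGRAPHIC_APOSTROPHE)
    )
    return {
        "U+0027": counts[TYPEWRITER_APOSTROPHE],
        "U+2019": counts[TYPOGRAPHIC_APOSTROPHE],
    }


def normalize_apostrophes(text):
    """Replace every U+2019 with U+0027."""
    return text.replace(TYPOGRAPHIC_APOSTROPHE, TYPEWRITER_APOSTROPHE)


if __name__ == "__main__":
    sample = "l\u2019arbre et l'eau"
    print(count_apostrophes(sample))
    print(normalize_apostrophes(sample))
```

Keep in mind that U+2019 is also the right single quotation mark, so `normalize_apostrophes` will change closing single quotes as well as apostrophes.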
#132 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
Hi all,
I implemented output to Kobo format in Penelope, my dictionary-converter tool.
Source code: http://code.google.com/p/penelope-dictionary-converter/
Doc/explanation: http://www.albertopettarin.it/penelope.html
Note that you will probably need Python 2.6+ (not Python 3.x) to run it.
Example: Code:
$ python penelope.py --output-kobo -p bar -f en -t it
Example 2: Code:
$ python penelope.py --xml --output-kobo -p bar -f en -t en
===
NOTE 1: the handling of 11.html entries is probably not completely correct right now. At the moment every key NOT starting with [a-z][a-z] (when lowercased) will go to 11.html. Better ideas?
NOTE 2: you will need to modify MARISA_BUILD_PATH in penelope.py, pointing it to the directory containing a working build of MARISA.
Last edited by AlPe; 12-08-2012 at 02:03 PM. |
#133 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
=== DELETED, wrong post ===
|
#134 |
Wizard
Posts: 1,178
Karma: 2431850
Join Date: Sep 2008
Device: IPad Mini 2 Retina
|
I suspect that 11.html is a "catch-all" file for entries that are not found elsewhere. I have had some success with files using characters other than a-z. So I suggest you try it and see.
|
#135 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
If 11.html is a "catch-all", then my current code is good.
Indeed, right now my code does the following: if the lower-cased version of a keyword starts with [a-z][a-z], then it appends that keyword (and its definition) to the corresponding file; otherwise, it appends it to 11.html.
Example: Code:
argon -> ar.html
yoga -> yo.html
a- -> 11.html
-meter -> 11.html
o'clock -> 11.html
To confirm this, I will experiment a bit, but now it is quite late... |
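For anyone who wants to play with the rule, it can be sketched in a few lines of Python 3. This is only an illustration of the rule as described in this post, not the actual Penelope code; `bucket_for` and `group_by_bucket` are names invented for the sketch.

```python
import re
from collections import defaultdict

# A keyword gets its own two-letter file only if, once lower-cased,
# it starts with two ASCII letters.
TWO_LETTER_PREFIX = re.compile(r"^[a-z][a-z]")
CATCH_ALL = "11.html"


def bucket_for(keyword):
    """Return the html file name a keyword would be appended to."""
    match = TWO_LETTER_PREFIX.match(keyword.lower())
    if match is None:
        return CATCH_ALL
    return match.group(0) + ".html"


def group_by_bucket(keywords):
    """Group keywords by target file, preserving input order within each file."""
    buckets = defaultdict(list)
    for keyword in keywords:
        buckets[bucket_for(keyword)].append(keyword)
    return dict(buckets)


if __name__ == "__main__":
    for word in ["argon", "yoga", "a-", "-meter", "o'clock"]:
        print(word, "->", bucket_for(word))
```

With this rule, keywords starting with accented or non-Latin letters also land in 11.html, since they do not match [a-z][a-z] after lower-casing. Whether the Kobo expects that is exactly what still needs testing.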