Old 03-04-2012, 08:08 PM   #31
DuckieTigger
Wizard
Posts: 4,742
Karma: 246906703
Join Date: Dec 2011
Location: USA
Device: Oasis 3, Oasis 2, PW3, PW1, KT
I wasn't trying to rush you at all, just to give you more options.

If you are using Wheezy, you could even look into packaging it up as a deb package, or have somebody else do it. Maybe Debian would include it. Typing "sudo apt-get install penelope" would sure beat any other method of installing it, especially since dependencies can be taken care of by the Debian system automatically.

But that depends on what you are planning to do with it. Since Penelope is already released under the GPL, it only makes sense, imo.
Old 03-05-2012, 03:41 AM   #32
AlPe
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
I see your point, I'll think about it; however, since I'm a perfectionist, I want it to have a man page, etc. ... i.e., some more work to be done.

BTW, if you are running Debian, you just need to install python (with python-pysqlite2) to run Penelope.
Old 03-07-2012, 02:46 AM   #33
DuckieTigger
I have a few comments about Penelope. I finally downloaded it and looked through your code. It appears to me that you hacked the XML support in afterwards, especially since you parse the file by hand, reinventing the wheel. Is that what happened?

For StarDict the custom parser might be useful, but for XML it is not. The parser is invoked after you have already gobblesmacked the XML file apart, so any useful information there might have been, other than key and def, is gone. If you are capable of writing your own custom parser, then you should be able to output an XML file with all the information necessary for Penelope, e.g. synonyms as an optional part of an <entry>. I'll rewrite your read_from_xml_format(); maybe you will like it.
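
As a rough illustration of the idea, here is a minimal sketch that reads such a file with Python's standard xml.etree.ElementTree instead of parsing by hand. The element names (<dictionary>, <entry>, <key>, <def>, <syn>) and the function name read_entries are assumptions made for this example; they are not Penelope's actual input format or code.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element names are assumed for illustration only.
SAMPLE = """<dictionary>
  <entry>
    <key>abbazia</key>
    <def>abbey</def>
    <syn>abbadia</syn>
  </entry>
  <entry>
    <key>mouse</key>
    <def>a small rodent</def>
  </entry>
</dictionary>"""


def read_entries(xml_text):
    """Return a list of (key, synonyms, definition) tuples."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter("entry"):
        key = entry.findtext("key", default="").strip()
        definition = entry.findtext("def", default="").strip()
        # <syn> is optional, so an entry without it yields an empty list.
        synonyms = [s.text.strip() for s in entry.findall("syn") if s.text]
        if key:
            entries.append((key, synonyms, definition))
    return entries


entries = read_entries(SAMPLE)
```

With a real parser, optional children such as synonyms survive until the point where the entry tuple is built, instead of being lost when the file is split into key and definition.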

Also, I do not yet quite understand what the difference between a substitution and a synonym is. Wouldn't it make sense to simply add synonyms to your global substitution list and let them be added at the end?
Old 03-07-2012, 03:29 AM   #34
AlPe
Quote:
Originally Posted by DuckieTigger View Post
I have a few comments about Penelope. I finally downloaded it and looked through your code. It appears to me that you hacked the XML support in afterwards, especially since you parse the file by hand, reinventing the wheel. Is that what happened?

For StarDict the custom parser might be useful, but for XML it is not. The parser is invoked after you have already gobblesmacked the XML file apart, so any useful information there might have been, other than key and def, is gone. If you are capable of writing your own custom parser, then you should be able to output an XML file with all the information necessary for Penelope, e.g. synonyms as an optional part of an <entry>. I'll rewrite your read_from_xml_format(); maybe you will like it.
Hi,

yes, the XML parser was added later. But its philosophy is clear: start from a set of (word, definition) pairs and read it. This is generally the case when you unpack a MOBI dictionary, for example. In that case, the custom parser part is still useful. For example, I might decide to extract synonyms from the definitions (say, if my definitions contain a <p>SYN: ... </p>, but only for some words).
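
A custom parser step of that kind might look like the sketch below. It assumes the marker is a paragraph of the form <p>SYN: a, b</p> with comma-separated synonyms; both the separator and the function name extract_synonyms are assumptions for illustration, not part of Penelope.

```python
import re

# Assumed marker: a paragraph of the form <p>SYN: a, b</p> inside the definition.
SYN_RE = re.compile(r"<p>\s*SYN:\s*(.*?)\s*</p>", re.IGNORECASE | re.DOTALL)


def extract_synonyms(definition):
    """Return (synonyms, definition without the SYN paragraph)."""
    match = SYN_RE.search(definition)
    if match is None:
        # Most words have no SYN paragraph: leave the definition untouched.
        return [], definition
    synonyms = [s.strip() for s in match.group(1).split(",") if s.strip()]
    cleaned = SYN_RE.sub("", definition, count=1).strip()
    return synonyms, cleaned


syns, cleaned = extract_synonyms("<p>abbey</p><p>SYN: abbadia, badia</p>")
```

Words without the marker come back with an empty synonym list, which matches the "only for some words" case.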

But I agree that, in general, one can code a more complex "XML" parser. I just wanted it to be quick and dirty, with the bare minimum needed by the remaining code, as I explain on the web page. But if you want to send me your code, I will look at it and integrate it into the tool.


Quote:
Originally Posted by DuckieTigger View Post
Also, I do not yet quite understand what the difference between a substitution and a synonym is. Wouldn't it make sense to simply add synonyms to your global substitution list and let them be added at the end?
With reference to this data format:

Code:
[ word, include, synonyms, substitutions, definition ]
the difference is that synonyms are associated with the current word, and extracted while parsing the definition of word, while substitutions are pairs (word, substitute_with_some_other_word).

The net effect of synonyms is that an entry is created for word, both in the index and in the definition files, while for each synonym only the index entry is created, pointing at the same definition of "word".

The net effect of substitutions is that an entry is created for "word", pointing at the definition of "substitute_with_some_other_word".

The functional difference is that substitutions can be done only after all the definitions are actually written to disk; that's why they are accumulated and processed together at the end. On the other hand, synonyms can be inserted in the index immediately.

Three motivating examples for this strategy.

1) When you parse stuff like a wiktionary, where you have lots of pages in the form "mice is the plural of mouse". (Two pages: "Mice" and "Mouse")
If you don't want to create a definition of "mice", but still have the definition of "mouse" displayed, when processing "mice" you can set up a substitution: the dictionary will not contain an entry for "mice", but when you select "mice" in your document you will get the definition for "mouse". But since you will encounter "mice" before "mouse", you do not know which position in the definition file your "mice" index record must point at. So, you will use a substitution in this case.
(Note that my code does not check that a definition for "Mouse" actually exists)

2) Another example occurs quite often in Italian, where adjectives have suffixes for masculine/feminine and singular/plural (amico, amica, amici, amiche are the four forms corresponding to "friend"). Usually in the dictionary you will find only the masculine singular (amico). But you might want all four forms to point at the same definition: (amico, amica, amici, amiche -> amico), without having defs for "amica", "amici", "amiche".

3) Sometimes a word has more than one spelling. Again, this is particularly true in Italian, where ancient spellings coexist with modern ones (say, "abbazia" and "abbadia" for "abbey"). Usually you will find only the modern term listed in the dictionary, but in its definition you will find something like "ANCIENT SPELLING: ...". In this case, you parse the definition of "abbazia", find out that there is also the ancient spelling "abbadia", and add it to the "abbazia" tuple as a synonym. Doing so will create two entries in the index (one for "abbazia", one for "abbadia") pointing at the same definition.
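
The two mechanisms can be sketched in a few lines of Python. This is a simplified in-memory model, not Penelope's code: the function name build_index, the use of a bytearray in place of the definition file, and the reading of "include" as "write a definition for this word" are all assumptions made for illustration.

```python
def build_index(entries):
    """Index [word, include, synonyms, substitutions, definition] tuples.

    Returns (index, blob): index maps a word to the (offset, length) of its
    definition inside blob, which stands in for the definition file.
    """
    index = {}
    blob = bytearray()
    pending = []  # substitutions, resolved only after the main pass

    for word, include, synonyms, substitutions, definition in entries:
        pending.extend(substitutions)
        if not include:
            continue
        data = definition.encode("utf-8")
        position = (len(blob), len(data))
        blob.extend(data)
        index[word] = position
        # Synonyms can be indexed right away: the position is already known.
        for synonym in synonyms:
            index[synonym] = position

    # Only now is every definition position known, so substitutions run last.
    for sub_from, sub_to in pending:
        if sub_to in index:
            index[sub_from] = index[sub_to]
    return index, blob


entries = [
    # "mice" is seen before "mouse", so it can only be a substitution.
    ["mice", False, [], [("mice", "mouse")], ""],
    ["mouse", True, [], [], "a small rodent"],
    ["amico", True, [],
     [("amica", "amico"), ("amici", "amico"), ("amiche", "amico")], "friend"],
    ["abbazia", True, ["abbadia"], [], "abbey"],
]
index, blob = build_index(entries)
```

After the call, "mice" and "mouse" share one position, as do the four forms of "amico" and the two spellings of "abbazia", while the definition text is stored only once per group.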

Last edited by AlPe; 03-07-2012 at 03:33 AM.
Old 03-07-2012, 03:51 AM   #35
DuckieTigger
Ahh, thank you, that makes sense, especially with Italian being so weird. By the way, you do check if Mouse exists when you deal with substitutions:
Code:
if sub_to in global_dictionary:
    # Reuse the target's tuple, replacing only the key with sub_from
    sql_tuple = global_dictionary[sub_to]
    sql_tuple = (sql_tuple[0], sub_from, sql_tuple[2], sql_tuple[3], sql_tuple[4])
Old 03-07-2012, 03:53 AM   #36
AlPe
Ok, great, I must have fixed that at some point and then forgotten about it!

Still, if you define a substitution, but the sub_to does not exist, you lose the sub_from stuff, I guess.
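
One way to avoid losing such entries silently would be to collect the substitutions that cannot be applied and report them. The sketch below is hypothetical: apply_substitutions is not a function in Penelope, and the five-field tuple layout with the key in position 1 is only inferred from the snippet quoted earlier in the thread.

```python
def apply_substitutions(global_dictionary, global_substitutions):
    """Add an entry for each (sub_from, sub_to) pair whose target exists.

    Returns the pairs that could not be applied, so that the caller can
    report them instead of dropping them silently.
    """
    unresolved = []
    for sub_from, sub_to in global_substitutions:
        if sub_to in global_dictionary:
            target = global_dictionary[sub_to]
            # Assumed layout: same tuple as the target, with the key swapped.
            global_dictionary[sub_from] = (target[0], sub_from) + tuple(target[2:])
        else:
            unresolved.append((sub_from, sub_to))
    return unresolved


dictionary = {"mouse": (1, "mouse", 0, 14, "a small rodent")}
unresolved = apply_substitutions(
    dictionary, [("mice", "mouse"), ("geese", "goose")]
)
```

Here "mice" gets an entry pointing at the data for "mouse", while ("geese", "goose") is returned as unresolved because "goose" was never defined.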
Old 03-07-2012, 09:42 AM   #37
AlPe
Hi,

I have just opened the following project at Google Code:

http://code.google.com/p/penelope-dictionary-converter/

If you want to contribute some code, please let me know via email or PM, and I will add you to the project.
Old 03-07-2012, 02:00 PM   #38
DuckieTigger
Quote:
Originally Posted by AlPe View Post
Hi,

I have just opened the following project at Google Code:

http://code.google.com/p/penelope-dictionary-converter/

If you want to contribute some code, please let me know via email or PM, and I will add you to the project.
Awesome, as soon as I have some code that is presentable, I will drop you a PM. Sounds like a good idea.
Old 03-13-2012, 10:34 AM   #39
AlPe
I have just pushed a new version of Penelope that allows you to output in StarDict format.

This is useful if you want to convert an XML dictionary into StarDict, or you want to write your own parser to manipulate an existing StarDict dictionary.

You can find the code at the Google Code project:

http://code.google.com/p/penelope-dictionary-converter/

and the "manual" here:

http://www.albertopettarin.it/penelope.html

(eventually, it will be moved to the wiki of the Google Code project).

Last edited by AlPe; 12-11-2012 at 03:02 PM. Reason: Updated the link to my homepage
Old 03-21-2012, 10:41 PM   #40
DuckieTigger
Quote:
Originally Posted by AlPe View Post
I have just pushed a new version of Penelope that allows you to output in StarDict format.

This is useful if you want to convert an XML dictionary into StarDict, or you want to write your own parser to manipulate an existing StarDict dictionary.

You can find the code at the Google Code project:

http://code.google.com/p/penelope-dictionary-converter/

and the "manual" here:

http://www.dei.unipd.it/~pettarin/penelope.html

(eventually, it will be moved to the wiki of the Google Code project).
Nice job, AlPe, on getting the outSD finished and pushed. I am in the process of moving the manual over to Google Code. It is closer to a rewrite, as the wiki syntax is not really compatible with HTML, and Google discourages mixing HTML and wiki syntax.

At least once it is all moved, it has the big advantage of being in a Mercurial repository with full-featured revision control, and I will be able to commit independently of you.

I would appreciate some feedback before I have everything moved, only to find out that I need to redo it all because of some strange reformatting.
Old 05-21-2012, 05:26 AM   #41
AlPe
Since Bookeen has not implemented a "search in dictionary" function yet, I attach to this post a sample EPUB file that might be used to "emulate" the search function.

It contains the list of words, grouped by starting letter and 4-letter prefix, so that one can "navigate" through the EPUB, find the desired word, select it, and have the dictionary definition displayed.

I attach the EPUB generated for the ENGLISH dictionary that I helped develop for SBF, based on the GNU Collaborative International Dictionary of English (despite being named "en.wikipedia.dict"). You can get it here:

DICT.IDX: http://bit.ly/wfCrcK
DICT: http://bit.ly/Ao8JjI

Usage (looking for "yerd"):

1) Open dictionary.v1.epub on the Odyssey, as any other EPUB file.
2) Select "Letter Y".
3) Select "YEAS - YIN" (as "yerd" falls into this interval).
4) Locate "yerd" and select it.
5) The dictionary should pop up the definition of "yerd".

This test EPUB was generated by a Python script that will be merged into the Penelope project, time permitting. Let me know what you think about it, or whether you have any suggestions or comments.
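
The grouping step described above can be sketched as follows. This is not the actual script: the function name group_words, the fixed chunk_size, and the label format (4-letter prefixes of the first and last word of each chunk, modelled on "YEAS - YIN") are assumptions for illustration, and the EPUB packaging itself is left out.

```python
from itertools import groupby


def group_words(words, chunk_size=100):
    """Group words by first letter, then split each group into labelled chunks.

    Returns {letter: [(label, [words...]), ...]}, where label is built from
    the 4-letter prefixes of the first and last word of the chunk.
    """
    ordered = sorted({w for w in words if w}, key=str.lower)
    toc = {}
    for letter, group in groupby(ordered, key=lambda w: w[0].upper()):
        group = list(group)
        chunks = []
        for start in range(0, len(group), chunk_size):
            chunk = group[start:start + chunk_size]
            label = "%s - %s" % (chunk[0][:4].upper(), chunk[-1][:4].upper())
            chunks.append((label, chunk))
        toc[letter] = chunks
    return toc


toc = group_words(["yin", "yeast", "yerd", "year", "zebra"], chunk_size=3)
```

Each letter would become one top-level page of the EPUB and each labelled chunk one sub-page listing its words, so that a word can be reached in two taps and then selected.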
Attached Files
File Type: epub dictionary.v1.epub (890.5 KB, 567 views)
Old 05-25-2012, 05:32 PM   #42
AlPe
Any comments or suggestions from those six who downloaded the EPUB file?

I am planning to release the "final version" of the EPUB "dictionary" next week, so I would be very grateful if you could post your thoughts here. Thanks!
Old 08-10-2012, 10:27 AM   #43
angelos_cy
Junior Member
Posts: 7
Karma: 10
Join Date: Aug 2012
Device: Sony Reader PRS-650, CyBook Odyssey, PocketBook Touch 622
EL-EN requested

Quote:
Originally Posted by AlPe View Post
Hi guys,

As I am working on the script to convert a stardict dictionary to the Cybook Odyssey format, I would like to test the conversion of several dictionaries.

To get two doves with a stone, please PM me a pointer to a freeware/GPL'ed/etc. (preferably stardict) dictionary you would like to have on your Cybook Odyssey.

Once the conversion goes well, I will give the converted dictionary to you.

You can also leave a reply here saying "EN -> IT requested" and the like.

Thanks.
Could I please have the attached dictionary converted to CyBook Odyssey format?
Thanks in advance!
Attached Files
File Type: bz2 Babylon Greek English.tar.bz2 (1.35 MB, 511 views)
Old 08-20-2012, 06:13 AM   #44
AlPe
Quote:
Originally Posted by angelos_cy View Post
Could I please have the attached dictionary converted to CyBook Odyssey format?
Thanks in advance!
Done. Since we are not sure that the dictionary can be redistributed, I refrain from posting it. You can ask for it via PM, though.
Old 08-22-2012, 09:34 AM   #45
angelos_cy
Quote:
Originally Posted by AlPe View Post
Done. Since we are not sure that the dictionary can be redistributed, I refrain from posting it. You can ask for it via PM, though.
Many thanks!
All times are GMT -4.