|
|
#136 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
|
|
|
|
|
|
#137 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
|
|
|
|
|
Enthusiast
|
|
|
|
#138 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
Do not take that expression too literally: it was shorthand for "every word (lowercased) NOT starting with two letters in the range a-z". My code actually generates files other than 11.html.
__________________
Personal Website (EN): http://www.albertopettarin.it/ (My) Company Website (EN/IT): http://www.smuuks.it/ Coordinator Member of eBookClub Italia: http://ebci.it/ |
|
|
|
|
|
#139 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
To be clearer, I am thinking of languages that do not use the Latin script.
|
|
|
|
|
|
#140 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
I tried this dictionary:
Code:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document SYSTEM "dictionary.dtd">
<dictionary>
  <entry>
    <key>Würstel</key>
    <def>Definition of <i>Würstel</i>, with upper case letter.</def>
  </entry>
  <entry>
    <key>würstel</key>
    <def>Definition of <i>würstel</i>, with lower case letter.</def>
  </entry>
  <entry>
    <key>1826</key>
    <def>Test of <i>1826</i> one-eight-two-six.</def>
  </entry>
  <entry>
    <key>è</key>
    <def>"è" is the Italian equivalent of "is".</def>
  </entry>
  <entry>
    <key>Leopardi</key>
    <def><i>Leopardi</i> is the name of a famous Italian poet.</def>
  </entry>
  <entry>
    <key>leopardi</key>
    <def><i>leopardi</i> means "leopards" in Italian.</def>
  </entry>
  <entry>
    <key>égloga</key>
    <def>Type of ancient poem.</def>
  </entry>
</dictionary>

Apparently, searching for "1826" retrieves its definition, while searching for "égloga" or "würstel" (both went into 11.html) does not. This could mean that the Kobo expects them to be in ég.html and wü.html files. So far so good, but I am puzzled by the fact that "è" does not retrieve its definition either, despite being in 11.html.
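To make the hypothesis above concrete, here is a minimal Python sketch that reads the keys from an abbreviated copy of the test dictionary and groups them by the two-character prefix each one would map to if the Kobo really does look for files such as ég.html and wü.html. The helper names `candidate_prefix` and `keys_by_prefix` are made up for this illustration and are not taken from Penelope; whether the device actually uses these file names is exactly what still needs to be verified.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the test dictionary: definitions shortened,
# XML declaration and DOCTYPE line omitted.
SAMPLE = """<dictionary>
<entry><key>Würstel</key><def>upper case</def></entry>
<entry><key>würstel</key><def>lower case</def></entry>
<entry><key>1826</key><def>digits</def></entry>
<entry><key>è</key><def>one character</def></entry>
<entry><key>Leopardi</key><def>poet</def></entry>
<entry><key>leopardi</key><def>leopards</def></entry>
<entry><key>égloga</key><def>poem</def></entry>
</dictionary>"""


def candidate_prefix(key):
    """Hypothetical rule: lowercase the key and keep its first two characters."""
    return key.lower()[:2]


def keys_by_prefix(xml_text):
    """Group every <key> in the dictionary by its candidate prefix."""
    root = ET.fromstring(xml_text)
    groups = {}
    for entry in root.iter("entry"):
        key = entry.findtext("key")
        groups.setdefault(candidate_prefix(key), []).append(key)
    return groups


if __name__ == "__main__":
    for prefix, keys in sorted(keys_by_prefix(SAMPLE).items()):
        print(prefix, keys)
```

Note that "è" ends up with a one-character prefix, which is one reason it may need separate handling from the two-character cases.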
Last edited by AlPe; 12-08-2012 at 04:27 PM. |
|
|
|
|
|
#141 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
Yes, that's for sure. How should this be dealt with?
|
|
|
|
|
#142 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
|
|
|
|
|
|
#143 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
|
|
|
|
|
|
#144 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
You are right, I tried and I confirm it.
|
|
|
|
|
#145 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
Seems legit. I will try to check tomorrow. Thanks for the lead.
|
|
|
|
|
#146 |
|
Connoisseur
Posts: 69
Karma: 497999
Join Date: Nov 2012
Device: kobo
|
@AlPe, would you like me to send you my French dictionary, split by pairs of letters, for your tests?
|
|
|
|
|
|
#147 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
Yes, please, especially if you have already experimented on which keywords should go to which XX.html file.
|
|
|
|
|
#148 |
|
Digital Amanuensis
Posts: 478
Karma: 1167803
Join Date: Dec 2011
Location: Padova, Italy
Device: Kindle3, Odyssey, eDGe, A60, PRS-T1, iPad3, KoboGlo
|
OK, I have run a test on the ASCII characters (0-127), excluding control characters, and it seems that you can move all keywords containing a non-letter (where a letter is defined as [A-Za-z]) in their first two characters into 11.html.
I attach the XML dictionary I used for testing, plus the "Italian" Kobo dictionary compiled by Penelope from the XML dictionary. (Note: download the XML file and open it with a text editor, because it might not display properly in a browser: it contains raw "< > &". Penelope does not use a DOM parser, hence it allows those characters to be unescaped.) I updated the Google Code source code of Penelope to reflect tshering's suggestion (thanks!) about 1-character keywords.
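For reference, the rule described above can be sketched in a few lines of Python. This is only an illustration and not the actual Penelope code: the function name `html_file_for` is invented here, the lowercasing follows the earlier "every word (lowercased)" description, and 1-character keywords are deliberately rejected because their handling follows a separate suggestion that is not spelled out in this post. The test behind the rule covered ASCII keywords only, so the result for non-ASCII keywords (for example "égloga") should be treated as unverified.

```python
import string

# A "letter" here means exactly [A-Za-z], as in the ASCII test.
ASCII_LETTERS = set(string.ascii_letters)


def html_file_for(keyword):
    """Pick the XX.html file for a keyword of at least two characters."""
    if len(keyword) < 2:
        # 1-character keywords follow a separate rule not covered by this sketch.
        raise ValueError("keyword must have at least two characters")
    first_two = keyword[:2]
    if any(ch not in ASCII_LETTERS for ch in first_two):
        # A non-letter in the first two characters: use the catch-all file.
        return "11.html"
    return first_two.lower() + ".html"


if __name__ == "__main__":
    for word in ["Leopardi", "leopardi", "1826", "a-z"]:
        print(word, "->", html_file_for(word))
```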
Last edited by AlPe; 12-09-2012 at 09:40 AM. Reason: Added note about the XML file |
|
|
|
|
|
#149 |
|
Connoisseur
Posts: 69
Karma: 497999
Join Date: Nov 2012
Device: kobo
|
@AlPe, I have sent you an email.
|
|
|
|
|
|
#150 |
|
Evangelist
Posts: 413
Karma: 111714
Join Date: Jun 2012
Device: kobo touch
|
In order to also work with languages that use scripts other than the Latin script, e.g. Greek, it would be preferable for your script to send to 11.html only those keywords that must be sent to 11.html, rather than all those keywords that can be sent to 11.html. This should be only a small change in your script, but it would have a great effect on usability.
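A possible shape for that change, as a rough Python sketch: instead of testing against [A-Za-z], the check uses Python's Unicode-aware `str.isalpha()`, so a Greek or accented keyword keeps its own two-letter file and only keywords with a real non-letter in the first two characters fall back to 11.html. The function name `prefix_file` is hypothetical, and the sketch assumes that the Kobo accepts two-letter files for any script, which has not been confirmed here. 1-character keywords are again left out.

```python
def prefix_file(keyword):
    """Return the XX.html file, falling back to 11.html only for non-letters."""
    if len(keyword) < 2:
        # 1-character keywords are outside the scope of this sketch.
        raise ValueError("keyword must have at least two characters")
    prefix = keyword.lower()[:2]
    # str.isalpha() is True for letters of any script, not just a-z.
    # Assumption: the device looks up such prefixes as file names.
    if len(prefix) == 2 and all(ch.isalpha() for ch in prefix):
        return prefix + ".html"
    return "11.html"


if __name__ == "__main__":
    for word in ["Λόγος", "égloga", "Würstel", "1826"]:
        print(word, "->", prefix_file(word))
```

With this variant, "1826" still goes to 11.html, while "Λόγος" and "égloga" get files named after their own first two letters.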
|
|
|
|