#136 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
I am not sure what exactly the "idea" behind 11.html is. As the case of "o'clock" shows (it works both in 11.html and in o'.html), the rules are not so clear-cut. To my mind the rule should be: put into 11.html all keywords that would otherwise result in an invalid filename (on any OS?), a period not being accepted as part of the filename proper. However, I did not test whether the search engine finds, for instance, "3rd" in 3r.html.
|
#137 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
|
#138 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
|
#139 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
To be clearer, I am thinking of languages that do not use the Latin script.
|
#140 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
I tried this dictionary:
Code:
    <?xml version = "1.0" encoding="UTF-8" standalone="no"?>
    <!DOCTYPE document SYSTEM "dictionary.dtd">
    <dictionary>
      <entry>
        <key>Würstel</key>
        <def>Definition of <i>Würstel</i>, with upper case letter.</def>
      </entry>
      <entry>
        <key>würstel</key>
        <def>Definition of <i>würstel</i>, with lower case letter.</def>
      </entry>
      <entry>
        <key>1826</key>
        <def>Test of <i>1826</i> one-eight-two-six.</def>
      </entry>
      <entry>
        <key>è</key>
        <def>"è" is the Italian equivalent of "is".</def>
      </entry>
      <entry>
        <key>Leopardi</key>
        <def><i>Leopardi</i> is the name of a famous Italian poet.</def>
      </entry>
      <entry>
        <key>leopardi</key>
        <def><i>leopardi</i> means "leopards" in Italian.</def>
      </entry>
      <entry>
        <key>égloga</key>
        <def>Type of ancient poem.</def>
      </entry>
    </dictionary>

Apparently, searching for "1826" retrieves its definition, while searching for "égloga" or "würstel" (both went into 11.html) does not. This could mean that the Kobo expects them to be in ég.html and wü.html. So far so good, but I am puzzled that "è" does not retrieve its definition either, despite being in 11.html.

Last edited by AlPe; 12-08-2012 at 04:27 PM. |
#141 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
|
#142 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
|
#143 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
|
#144 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
|
#145 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
|
#146 |
Connoisseur
Posts: 86
Karma: 546021
Join Date: Nov 2012
Device: kobo
|
@AlPe, would you like me to send you my French dictionary, which covers a couple of letters, for your tests?
|
#147 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
Yes, please, especially if you have already experimented on which keywords should go to which XX.html file.
|
#148 |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
OK, I have tested the ASCII characters (0-127), except control characters, and it seems that you can move into 11.html all keywords that contain a non-letter (a letter being defined as [A-Za-z]) in their first two characters.
I attach the XML dictionary I used for testing, plus the "Italian" Kobo dictionary compiled by Penelope from the XML dictionary. (Note: download the XML file and open it with a text editor, because it might not display properly in a browser: it contains raw "< > &". Penelope does not use a DOM parser, hence it allows those characters to be unescaped.) I updated the Penelope source code on Google Code to reflect tshering's suggestion (thanks!) about 1-character keywords.
Last edited by AlPe; 12-09-2012 at 09:40 AM. Reason: Added note about the XML file |
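The rule found in this test could be sketched roughly as follows. This is only an illustration, not Penelope's actual code: the function name `ascii_rule_prefix` is made up, lowercasing the prefix is an assumption based on file names such as o'.html and wü.html mentioned in the thread, and sending keywords shorter than two characters to 11.html is a placeholder, since the post does not say how they are handled.

```python
import string

# "Letter" as defined in the post: [A-Za-z] only.
ASCII_LETTERS = set(string.ascii_letters)


def ascii_rule_prefix(keyword: str) -> str:
    """Return the assumed file prefix (without ".html") for a keyword.

    Hypothetical helper: if the first two characters are both ASCII
    letters, use them (lowercased, an assumption); otherwise use "11".
    """
    head = keyword[:2]
    if len(head) == 2 and all(ch in ASCII_LETTERS for ch in head):
        return head.lower()
    # Non-letter in the first two characters, or keyword too short
    # (the short case is a placeholder choice, not stated in the post).
    return "11"


if __name__ == "__main__":
    for key in ["Leopardi", "leopardi", "1826", "o'clock", "3rd", "égloga", "würstel"]:
        print(key, "->", ascii_rule_prefix(key) + ".html")
```

With this rule, "égloga" and "würstel" end up in 11.html, which matches the layout tried in the earlier test where they could not be found.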
#149 |
Connoisseur
Posts: 86
Karma: 546021
Join Date: Nov 2012
Device: kobo
|
@AlPe, I am sending you an email.
|
#150 |
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
|
To also work with languages that use scripts other than Latin, e.g. Greek, it would be preferable for your script to send to 11.html only those keywords that must be sent to 11.html, rather than all those keywords that can be sent to 11.html. This should be only a small change in your script, but it would have a great effect on usability.
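A minimal sketch of what such a stricter rule might look like, assuming that "must be sent to 11.html" means "the first two characters would not make a valid file name". Everything here is hypothetical: the names `UNSAFE_CHARS`, `is_filename_safe` and `conservative_prefix`, the exact set of rejected characters, the lowercasing, and the treatment of one-character keywords. Whether the Kobo search actually looks in the resulting files (e.g. 3r.html) was not tested in this thread.

```python
import unicodedata

# Assumed set of characters rejected in file names on at least one common OS,
# plus the period mentioned earlier in the thread.
UNSAFE_CHARS = set('<>:"/\\|?*.')


def is_filename_safe(ch: str) -> bool:
    """Hypothetical check for a single character of the file-name prefix."""
    # Reject control/format/unassigned characters (C*) and separators (Z*).
    if unicodedata.category(ch)[0] in "CZ":
        return False
    return ch not in UNSAFE_CHARS


def conservative_prefix(keyword: str) -> str:
    """Return "11" only when the two-character prefix cannot be a file name."""
    # NFC keeps a precomposed accented letter as one code point,
    # so taking two characters does not split it from its accent.
    head = unicodedata.normalize("NFC", keyword)[:2].lower()
    if len(head) == 2 and all(is_filename_safe(ch) for ch in head):
        return head
    # Too short (placeholder choice) or contains an unsafe character.
    return "11"


if __name__ == "__main__":
    for key in ["λόγος", "égloga", "Würstel", "o'clock", "3rd", "a.b"]:
        print(key, "->", conservative_prefix(key) + ".html")
```

Under these assumptions a Greek keyword such as "λόγος" gets its own λό.html instead of landing in 11.html, while a keyword such as "a.b" still goes to 11.html.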
|