04-14-2022, 09:17 AM | #1 |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
How Kindle matches a word when it is looked up in a dictionary
Hello,
Today I unpacked an embedded Kindle dictionary with KindleUnpack and found that there is no idx:entry for the word "carefully", and no idx:iform for it in the idx:entry of "careful" either. I'm curious how the word "carefully" can be matched by the embedded Kindle dictionary. Here is the idx:entry for the word "careful" in that dictionary.
Code:
<idx:entry scriptable="yes">
  <idx:orth value="careful">
    <idx:infl>
      <idx:iform name="" value="carefuller" />
      <idx:iform name="" value="carefullest" />
    </idx:infl>
  </idx:orth>
  <i>adjective</i> <b>carefuler</b>; <b>carefullest</b>
  <div align="left">
    <b>1</b> : using or taking care <br />
    <b>2</b> : marked by solicitude, caution, or prudence
  </div>
  <div align="left">
    <b>carefully</b> <i>adverb</i>
  </div>
</idx:entry>
Could anyone share any experience with this? |
04-14-2022, 10:43 AM | #2 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
When a dictionary is created, much of the HTML markup is removed and converted into various indexes. KindleUnpack attempts to undo that conversion, but it does not handle every possible case.
I might be able to look into this in more detail if I can locate the specific dictionary that you are using. I took a look at the English language dictionaries on my Oasis 2. They are "Oxford Dictionary of English_B0053VMNYW" and "The New Oxford American Dictionary_B0053VMNY2". Neither has the same definition for "careful" as your dictionary. |
04-14-2022, 10:56 AM | #3 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
|
|
04-14-2022, 11:27 AM | #4 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
If I create my own dictionary with "register" defined, but without "registered" defined in an idx:entry or idx:iform, "registered" cannot be matched to "register". |
|
04-14-2022, 11:46 AM | #5 | |
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Code:
C:\Users\<user>\AppData\Local\Amazon\Kindle Previewer 3\lib\fc\bin\kindlegen.exe
|
|
04-14-2022, 05:47 PM | #6 | |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
Quote:
I used KindleUnpack to convert the dictionary to HTML and found the expected idx:orth element for "register" there along with the definition. There was no idx:iform for "registered" in that entry. In fact the HTML produced using KindleUnpack for that dictionary shows no idx:iform elements at all in the entire file, which is obviously incorrect. Looking at the original MOBI file I see that it does have an Inflection Index, but it has entries in a format that is not supported by KindleUnpack. From that index I determined that the nonsense word "registerthest" is another iform for "register" in that dictionary. I was able to confirm it by doing a dictionary lookup for "registerthest" on my Kindle which showed the definition for "register". |
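The check described above (looking for idx:orth and idx:iform elements in the HTML that KindleUnpack produces) can be roughly automated. The following is only a minimal sketch: the regular expressions and the `build_lookup` helper are hypothetical names made up for illustration, and the sample markup is the abbreviated "careful" entry from the first post. It only models what the unpacked markup declares, not what the Kindle actually does with the binary indexes.

```python
import re

# Sample markup: the "careful" entry quoted in the first post, abbreviated.
SAMPLE = '''
<idx:entry scriptable="yes">
<idx:orth value="careful">
<idx:infl>
<idx:iform name="" value="carefuller" />
<idx:iform name="" value="carefullest" />
</idx:infl>
</idx:orth>
</idx:entry>
'''

# Hypothetical helper patterns; real dictionary markup may need a proper parser.
ENTRY_RE = re.compile(r'<idx:entry\b.*?</idx:entry>', re.DOTALL)
ORTH_RE = re.compile(r'<idx:orth\b[^>]*\bvalue="([^"]*)"')
IFORM_RE = re.compile(r'<idx:iform\b[^>]*\bvalue="([^"]*)"')


def build_lookup(html):
    """Map every headword and every listed inflected form to its headword."""
    lookup = {}
    for entry in ENTRY_RE.findall(html):
        orth = ORTH_RE.search(entry)
        if orth is None:
            continue  # entry without a headword, skip it
        headword = orth.group(1)
        lookup[headword] = headword
        for form in IFORM_RE.findall(entry):
            # Keep the first headword seen if a form is listed twice
            lookup.setdefault(form, headword)
    return lookup


lookup = build_lookup(SAMPLE)
print(len(IFORM_RE.findall(SAMPLE)), "idx:iform elements found")  # 2 idx:iform elements found
print(lookup.get("carefullest"))  # careful
print(lookup.get("carefully"))    # None
```

If a form such as "carefully" is missing from this table but still resolves on the device, the match has to come from index data that did not survive unpacking.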
|
04-14-2022, 06:19 PM | #7 |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
|
04-14-2022, 06:34 PM | #8 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
So I should report a bug for KindleUnpack (https://github.com/kevinhendricks/KindleUnpack), right? BTW, how do you analyze a MOBI file? Do you just open it with an editor in binary mode and then check the data against the MOBI format spec (https://wiki.mobileread.com/wiki/MOBI)? |
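As a starting point for that kind of inspection, here is a small sketch that reads the orthographic and inflection index record numbers from the MOBI header. It assumes the layout described on the wiki page linked above: a PalmDB record count at offset 76, 8-byte record entries from offset 78, a 16-byte PalmDOC header at the start of record 0, and the two index fields at offsets 40 and 44 of record 0. The function name `read_index_records` is made up for this example, and the offsets should be verified against the spec before relying on them.

```python
import struct

NO_INDEX = 0xFFFFFFFF  # value the wiki lists for "index not available"


def read_index_records(data):
    """Return (orth_index, infl_index) record numbers from raw MOBI bytes.

    Either value is None when the header marks the index as absent.
    Offsets are assumptions taken from the MobileRead wiki description.
    """
    # PalmDB header: 2-byte big-endian record count at offset 76
    (num_records,) = struct.unpack_from(">H", data, 76)
    if num_records == 0:
        raise ValueError("file contains no records")
    # Record entries start at offset 78; the first 4 bytes are the file offset
    (rec0,) = struct.unpack_from(">L", data, 78)
    # Record 0 starts with a 16-byte PalmDOC header, then the MOBI header
    if data[rec0 + 16:rec0 + 20] != b"MOBI":
        raise ValueError("MOBI header not found in record 0")
    orth, infl = struct.unpack_from(">LL", data, rec0 + 40)
    return (None if orth == NO_INDEX else orth,
            None if infl == NO_INDEX else infl)
```

Reading the whole file in binary mode and passing the bytes to `read_index_records` gives the record numbers to look at next in a hex editor.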
|
04-14-2022, 07:33 PM | #9 | ||
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
Quote:
Parsing metaInflIndexData
Error: Dictionary uses obsolete inflection rule scheme which is not yet supported
Quote:
|
||
04-14-2022, 07:50 PM | #10 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
I was curious about the Merriam-Webster's Dictionary & Thesaurus, so I went ahead and purchased it. As you found, KindleUnpack does not produce anything for "carefully".
Looking at the MOBI file I see that the Orthographic Index does actually contain an entry for "carefully" that points to the entry for "careful". Apparently that is another situation that KindleUnpack does not handle. |
04-14-2022, 08:33 PM | #11 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
BTW, do you know what kinds of inflection rules a Kindle dictionary can have? I only know of "idx:iform" so far. |
|
04-14-2022, 08:37 PM | #12 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
|
|
04-14-2022, 09:57 PM | #13 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
That is the only form that I have seen documented. There is at least one other undocumented method that applies simple rules like add “ed” and “ing” to every entry. I don’t know how that is activated.
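To make the idea of "simple rules" concrete, here is a purely illustrative sketch of suffix-based matching. The actual undocumented rule set and how it is stored are unknown, so the `SUFFIXES` tuple (just the two examples mentioned above) and the `candidate_headwords` function are assumptions made for this example, not a description of what the Kindle does.

```python
# The two suffixes mentioned above; the real rule set is unknown (assumption).
SUFFIXES = ("ed", "ing")


def candidate_headwords(word, headwords):
    """Return headwords that `word` matches directly or after stripping one suffix."""
    matches = [word] if word in headwords else []
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            if stem in headwords:
                matches.append(stem)
    return matches


headwords = {"register", "careful"}
print(candidate_headwords("registered", headwords))  # ['register']
print(candidate_headwords("carefully", headwords))   # []
```

Under these assumed rules "registered" resolves to "register", while "carefully" would still need an explicit index entry.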
|
04-14-2022, 11:49 PM | #14 |
Evangelist
Posts: 411
Karma: 2666666
Join Date: Nov 2020
Device: none
|
libmobi (https://github.com/bfabiszewski/libmobi) has better support for dictionaries.
|
04-19-2022, 05:21 AM | #15 | |
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
|
|
|