Old 04-14-2022, 09:17 AM   #1
njpig
Connoisseur
Posts: 95
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
How Kindle matches a word that is looked up in a dictionary

Hello,

Today I unpacked an embedded Kindle dictionary with KindleUnpack and
found that there is no idx:entry for the word "carefully", and no idx:iform for it in the idx:entry of "careful" either.

So how does the embedded Kindle dictionary manage to match the word "carefully"?


Here is the idx:entry for the word "careful" in the embedded kindle dictionary.
Code:
<idx:entry scriptable="yes">
  <idx:orth value="careful">
    <idx:infl>
      <idx:iform name="" value="carefuller" />
      <idx:iform name="" value="carefullest" />
    </idx:infl>
  </idx:orth>
  <i>adjective</i>
  <b>carefuler</b>; <b>carefullest</b>
  <div align="left">
    <b>1</b>
    : using or taking care <br />
    <b>2</b>
    : marked by solicitude, caution, or prudence
  </div>
  <div align="left"> <b>carefully</b>
    <i>adverb</i>
  </div>
</idx:entry>
I tried to create a simple dictionary of my own containing the idx:entry above (packed with Mobipocket Creator), but the word "carefully" could not be matched by my dictionary.

Could anyone share any experiences on this?
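
One way to double-check which forms the unpacked HTML actually indexes is to scan it for every idx:orth and idx:iform value. The following is only a minimal sketch, assuming the markup looks like the entry above; the helper name `indexed_forms` and the regex-based approach are illustrative choices, not part of KindleUnpack.

```python
import re

# Sample markup shortened from the unpacked entry for "careful".
SAMPLE = '''
<idx:entry scriptable="yes">
  <idx:orth value="careful">
    <idx:infl>
      <idx:iform name="" value="carefuller" />
      <idx:iform name="" value="carefullest" />
    </idx:infl>
  </idx:orth>
  <div align="left"> <b>carefully</b> <i>adverb</i> </div>
</idx:entry>
'''

# Capture value="..." on idx:orth and idx:iform tags only. A regex is used
# because the idx: prefix is usually undeclared, which strict XML parsers reject.
FORM_RE = re.compile(r'<idx:(orth|iform)\b[^>]*?\bvalue="([^"]*)"')


def indexed_forms(html):
    """Return a dict mapping each indexed form to 'orth' or 'iform'."""
    return {value: kind for kind, value in FORM_RE.findall(html)}


forms = indexed_forms(SAMPLE)
for word in ("careful", "carefuller", "carefully"):
    print(word, "->", forms.get(word, "not indexed"))
```

With the sample entry, "carefully" shows up as not indexed even though it appears in the definition text.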
Old 04-14-2022, 10:43 AM   #2
jhowell
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
When a dictionary is created, much of the HTML markup is removed and converted into various indexes. KindleUnpack attempts to undo that conversion, but it does not handle every possible case.

I might be able to look into this in more detail if I can locate the specific dictionary that you are using. I took a look at the English language dictionaries on my Oasis 2. They are "Oxford Dictionary of English_B0053VMNYW" and "The New Oxford American Dictionary_B0053VMNY2". Neither has the same definition for "careful" as your dictionary.
Old 04-14-2022, 10:56 AM   #3
njpig
Quote:
Originally Posted by jhowell View Post
When a dictionary is created, much of the HTML markup is removed and converted into various indexes. KindleUnpack attempts to undo that conversion, but it does not handle every possible case.

I might be able to look into this in more detail if I can locate the specific dictionary that you are using. I took a look at the English language dictionaries on my Oasis 2. They are "Oxford Dictionary of English_B0053VMNYW" and "The New Oxford American Dictionary_B0053VMNY2". Neither has the same definition for "careful" as your dictionary.
The dictionary I showed above is "Merriam-Webster's Dictionary and Thesaurus B000SF9O22"
Old 04-14-2022, 11:27 AM   #4
njpig
Quote:
Originally Posted by jhowell View Post
I took a look at the English language dictionaries on my Oasis 2. They are "Oxford Dictionary of English_B0053VMNYW" and "The New Oxford American Dictionary_B0053VMNY2". Neither has the same definition for "careful" as your dictionary.
Try "registered" with your Oxford. This word is defined neither in idx:entry nor idx:iform, but it can be matched as "register".

If I create my own dictionary with "register" defined but without "registered" defined in an idx:entry or idx:iform, "registered" cannot be matched as "register".
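
Since a home-made dictionary apparently only matches forms that end up in its own indexes, one workaround is to list every inflected form explicitly as an idx:iform. Below is a rough sketch that generates such an entry, assuming the same markup layout as the idx:entry in post #1. The helper `build_entry`, the list of forms, and the placeholder definition are all hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def build_entry(headword, inflections, definition_html):
    """Build one idx:entry that lists every inflected form explicitly.

    definition_html is inserted as-is, so it must already be valid markup.
    """
    lines = ['<idx:entry scriptable="yes">',
             '  <idx:orth value=%s>' % quoteattr(headword)]
    if inflections:
        lines.append('    <idx:infl>')
        for form in inflections:
            # One idx:iform per form that should resolve to this headword.
            lines.append('      <idx:iform name="" value=%s />' % quoteattr(form))
        lines.append('    </idx:infl>')
    lines.append('  </idx:orth>')
    lines.append('  <b>%s</b>' % escape(headword))
    lines.append('  ' + definition_html)
    lines.append('</idx:entry>')
    return '\n'.join(lines)


entry = build_entry("register",
                    ["registers", "registered", "registering"],
                    "<div>placeholder definition</div>")
print(entry)
```

Whether the packing tool then honours these forms still has to be checked on the device.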
Old 04-14-2022, 11:46 AM   #5
Doitsu
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by njpig View Post
I tried to create a simple dictionary of my own containing the idx:entry above (packed with Mobipocket Creator).
Mobipocket Creator is outdated. Try KindleGen. If Kindle Previewer is installed, you can find it in:
Code:
C:\Users\<user>\AppData\Local\Amazon\Kindle Previewer 3\lib\fc\bin\kindlegen.exe
(<user> is your user name.)
Old 04-14-2022, 05:47 PM   #6
jhowell
Quote:
Originally Posted by njpig View Post
Try "registered" with your Oxford. This word is defined neither in idx:entry nor idx:iform, but it can be matched as "register".
I looked up "registered" on my Kindle in the Oxford Dictionary of English. As expected it found the entry for "register".

I used KindleUnpack to convert the dictionary to HTML and found the expected idx:orth element for "register" there along with the definition. There was no idx:iform for "registered" in that entry. In fact the HTML produced using KindleUnpack for that dictionary shows no idx:iform elements at all in the entire file, which is obviously incorrect.

Looking at the original MOBI file I see that it does have an Inflection Index, but it has entries in a format that is not supported by KindleUnpack. From that index I determined that the nonsense word "registerthest" is another iform for "register" in that dictionary. I was able to confirm it by doing a dictionary lookup for "registerthest" on my Kindle which showed the definition for "register".
Old 04-14-2022, 06:19 PM   #7
njpig
Quote:
Originally Posted by Doitsu View Post
Mobipocket Creator is outdated. Try KindleGen. If Kindle Previewer is installed, you can find it in:
Code:
C:\Users\<user>\AppData\Local\Amazon\Kindle Previewer 3\lib\fc\bin\kindlegen.exe
(<user> is your user name.)
I tried. It did not work.
Old 04-14-2022, 06:34 PM   #8
njpig
Quote:
Originally Posted by jhowell View Post
I looked up "registered" on my Kindle in the Oxford Dictionary of English. As expected it found the entry for "register".

I used KindleUnpack to convert the dictionary to HTML and found the expected idx:orth element for "register" there along with the definition. There was no idx:iform for "registered" in that entry. In fact the HTML produced using KindleUnpack for that dictionary shows no idx:iform elements at all in the entire file, which is obviously incorrect.

Looking at the original MOBI file I see that it does have an Inflection Index, but it has entries in a format that is not supported by KindleUnpack. From that index I determined that the nonsense word "registerthest" is another iform for "register" in that dictionary. I was able to confirm it by doing a dictionary lookup for "registerthest" on my Kindle which showed the definition for "register".
Thanks a lot!

So I should report a bug for KindleUnpack (https://github.com/kevinhendricks/KindleUnpack), right?

BTW, how do you analyze a MOBI file? Do you just open it with an editor in binary mode and check the data against the MOBI format spec (https://wiki.mobileread.com/wiki/MOBI)?
Old 04-14-2022, 07:33 PM   #9
jhowell
Quote:
Originally Posted by njpig View Post
So I should report a bug for KindleUnpack (https://github.com/kevinhendricks/KindleUnpack), right?
That is up to you. I doubt that anything will happen unless you do it yourself. The error message produced by KindleUnpack for the Oxford dictionary is:

Parsing metaInflIndexData
Error: Dictionary uses obsolete inflection rule scheme which is not yet supported


Quote:
Originally Posted by njpig View Post
BTW, how do you analyze a MOBI file? Do you just open it with an editor in binary mode and check the data against the MOBI format spec (https://wiki.mobileread.com/wiki/MOBI)?
KindleUnpack has some options to dump data when run in CLI mode. There is still a lot of manual work involved. The documentation is incomplete and you will need to work some things out for yourself by trial and error.
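
As a starting point for that kind of manual work, here is a small sketch that reads the PalmDB container header and the MOBI header in record 0 to see whether a file has an orthographic index and an inflection index. The offsets follow the MOBI wiki page linked above and should be treated as assumptions to verify against your own file. The function name `read_dict_indexes` and the file name in the usage comment are hypothetical.

```python
import struct

NO_INDEX = 0xFFFFFFFF  # value stored in the header when an index is absent


def read_dict_indexes(data):
    """Locate record 0 and read the two dictionary index fields."""
    # PalmDB header: the record count is a big-endian uint16 at offset 76,
    # followed at offset 78 by 8-byte record-info entries (offset first).
    (num_records,) = struct.unpack_from(">H", data, 76)
    if num_records == 0:
        raise ValueError("no records in PalmDB container")
    (rec0_offset,) = struct.unpack_from(">L", data, 78)

    # Record 0: 16-byte PalmDOC header, then the MOBI header.
    if data[rec0_offset + 16:rec0_offset + 20] != b"MOBI":
        raise ValueError("record 0 has no MOBI header")

    # Orthographic index at 0x28 and inflection index at 0x2C,
    # both measured from the start of record 0.
    orth, infl = struct.unpack_from(">LL", data, rec0_offset + 0x28)
    return {
        "records": num_records,
        "orth_index": None if orth == NO_INDEX else orth,
        "infl_index": None if infl == NO_INDEX else infl,
    }


# Usage (hypothetical file name):
# with open("dictionary.mobi", "rb") as f:
#     print(read_dict_indexes(f.read()))
```

The returned numbers are record numbers of the index metadata records, which is where the actual decoding work begins.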
Old 04-14-2022, 07:50 PM   #10
jhowell
I was curious about the Merriam-Webster's Dictionary & Thesaurus, so I went ahead and purchased it. As you found, KindleUnpack does not produce anything for "carefully".

Looking at the MOBI file I see that the Orthographic Index does actually contain an entry for "carefully" that points to the entry for "careful". Apparently that is another situation that KindleUnpack does not handle.
Old 04-14-2022, 08:33 PM   #11
njpig
Quote:
Originally Posted by jhowell View Post
I was curious about the Merriam-Webster's Dictionary & Thesaurus, so I went ahead and purchased it. As you found, KindleUnpack does not produce anything for "carefully".

Looking at the MOBI file I see that the Orthographic Index does actually contain an entry for "carefully" that points to the entry for "careful". Apparently that is another situation that KindleUnpack does not handle.
The synonyms part of Merriam-Webster's Dictionary & Thesaurus is not shown in the popup window during lookup because it is not included in the corresponding idx:entry. So I tried moving that part into the idx:entry, and it worked. However, after I regenerated the dictionary, this "carefully" issue occurred.

BTW, do you know what kinds of inflection rules a Kindle dictionary has? I only know "idx:iform" so far.
Old 04-14-2022, 08:37 PM   #12
njpig
Quote:
Originally Posted by jhowell View Post
That is up to you. I doubt that anything will happen unless you do it yourself. The error message produced by KindleUnpack for the Oxford dictionary is:

Parsing metaInflIndexData
Error: Dictionary uses obsolete inflection rule scheme which is not yet supported




KindleUnpack has some options to dump data when run in CLI mode. There is still a lot of manual work involved. The documentation is incomplete and you will need to work some things out for yourself by trial and error.
Thanks for your comments!
Old 04-14-2022, 09:57 PM   #13
jhowell
Quote:
Originally Posted by njpig View Post
BTW, do you know what kinds of inflection rules a Kindle dictionary has? I only know "idx:iform" so far.
That is the only form that I have seen documented. There is at least one other undocumented method that applies simple rules like add “ed” and “ing” to every entry. I don’t know how that is activated.
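
To illustrate what a rule-based scheme of that kind might look like, here is a purely hypothetical sketch. It is not the Kindle's actual algorithm, which is undocumented; the suffix list contains only the two endings mentioned above, and the function name `lookup` is made up.

```python
# Hypothetical suffix rules, limited to the two endings mentioned above.
SUFFIXES = ("ing", "ed")


def lookup(word, headwords):
    """Return the matching headword, trying an exact match before suffix rules."""
    word = word.lower()
    if word in headwords:
        return word
    for suffix in SUFFIXES:
        # Only strip a suffix when something is left over to look up.
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = word[:-len(suffix)]
            if candidate in headwords:
                return candidate
    return None


headwords = {"careful", "register"}
for query in ("registered", "registering", "carefully"):
    print(query, "->", lookup(query, headwords))
```

Under these two rules "registered" resolves to "register", while "carefully" still finds nothing, so a form like that would need its own index entry.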
Old 04-14-2022, 11:49 PM   #14
xxyzz
Evangelist
Posts: 411
Karma: 2666666
Join Date: Nov 2020
Device: none
Libmobi (https://github.com/bfabiszewski/libmobi) has better support for dictionaries.
Old 04-19-2022, 05:21 AM   #15
njpig
Quote:
Originally Posted by xxyzz View Post
Libmobi (https://github.com/bfabiszewski/libmobi) has better support for dictionaries.
Thank you for the information.