Old 04-28-2020, 05:16 PM   #1
geotadams
Junior Member
geotadams began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Apr 2020
Device: Kindle Basic
Dealing with homonyms/multiple words with the same inflection?

I'm currently working on a Slovak-German dictionary for personal use, and I've come across a confusing problem. There are certain entries from the dict.cc data which are effectively the same word, or are at least spelled the same way, but have, for example, very different meanings depending on context.

What I want is for the dictionary to pull up multiple entries when one of these homonyms is selected. I know this is possible, as it's been implemented very effectively in Duden's native monolingual German dictionary: if you select a word with multiple meanings, taking "es" as an example, it brings up several tabs which can be navigated left and right with the arrows. In my dictionary, this does not seem to work. Take the word "cítiť," meaning "to feel." When the past tense, cítil, is selected, I would like the dictionary to pull up the entries for cítiť, cítiť sa (reflexive), and cítiť in the sense of "to smell." Right now, it only pulls up the reflexive entry, the last in the list.

Is there a way for the average user to implement what Duden was able to do, or am I going to just have to put all the possible definitions in one entry?
Old 04-28-2020, 06:28 PM   #2
jhowell
Grand Sorcerer
jhowell ought to be getting tired of karma fortunes by now.
Posts: 6,497
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
Perhaps you could unpack the Duden dictionary using KindleUnpack and see how it is coded. (I don't know whether or not DRM would prevent that.)
Old 04-28-2020, 11:34 PM   #3
geotadams
Junior Member
geotadams began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Apr 2020
Device: Kindle Basic
Quote:
Originally Posted by jhowell View Post
Perhaps you could unpack the Duden dictionary using KindleUnpack and see how it is coded. (I don't know whether or not DRM would prevent that.)
That was my first thought too. Unfortunately, when I attempted it, the unpacking ran for hours on end and stopped at 40%, so I had to give up.
Old 04-29-2020, 03:50 AM   #4
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 11,164
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
In English sometimes they are pronounced the same or differently.
Bear [1] : Animal
Bear [2] : Carry

Lead [1] : A pliable heavy metal once used in solder and roofing
Lead [2] : To manage a group of people
Lead [3] : A leash or strap used to control or manage an animal or small child
Lead [4] : The writing core of a pencil. Actual metallic lead[1] was never used, but graphite mixed with clay.

Broach, Brooch, led and lead[1] are homonyms but not a problem as the spelling differs. Lead[1] and Lead[2] are pronounced quite differently.

I thought all the dictionaries had multiple entries when one spelling has several meanings. The homonym aspect isn't relevant on source text highlight and look up?

What about common phrases?
Old 04-29-2020, 08:59 AM   #5
geotadams
Junior Member
geotadams began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Apr 2020
Device: Kindle Basic
Quote:
Originally Posted by Quoth View Post
In English sometimes they are pronounced the same or differently.
Bear [1] : Animal
Bear [2] : Carry

Lead [1] : A pliable heavy metal once used in solder and roofing
Lead [2] : To manage a group of people
Lead [3] : A leash or strap used to control or manage an animal or small child
Lead [4] : The writing core of a pencil. Actual metallic lead[1] was never used, but graphite mixed with clay.

Broach, Brooch, led and lead[1] are homonyms but not a problem as the spelling differs. Lead[1] and Lead[2] are pronounced quite differently.

I thought all the dictionaries had multiple entries when one spelling has several meanings. The homonym aspect isn't relevant on source text highlight and look up?

What about common phrases?
They do normally, but I seem to have trouble actually implementing this in practice.
Old 04-29-2020, 12:06 PM   #6
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 11,164
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by geotadams View Post
They do normally, but I seem to have trouble actually implementing this in practice.
Isn't it really just one entry for all the words with the same spelling, formatted to look like multiple entries?
<index anchor and word>[1]<pronunciation><meaning><newline>
<same word>[2]<pronunciation><meaning><newline>
<same word>[3]<pronunciation><meaning>

I.e. you just do a normal entry but with more text and formatting as above. The Dictionary search of the highlighted word is purely using spelling. There is no context or AI, so with "lead" you have a single entry with all the meanings numbered.
Words with actually different spelling are no problem and the fact that some sound the same is irrelevant.

So maybe you are just "overthinking" it. What "seems" like four enumerated entries actually has to be only one entry as the lookup can never ever know which is needed.
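
A minimal sketch of that single-entry approach in Python, assuming the senses have already been grouped by spelling. The function name merged_entry and the [n] numbering layout are hypothetical; only the idx: tags come from the Kindle dictionary markup. It has not been tested on a device.

```python
from xml.sax.saxutils import escape, quoteattr


def merged_entry(headword, senses, inflections):
    """Build ONE idx:entry that lists every sense of a headword, numbered."""
    # One idx:iform per inflected form, all attached to the single headword.
    iforms = "\n".join(
        f"      <idx:iform value={quoteattr(form)} />" for form in inflections
    )
    # Each sense becomes a numbered paragraph inside the same entry.
    body = "\n".join(
        f"  <p><b>{escape(headword)}</b> [{n}] {escape(meaning)}</p>"
        for n, meaning in enumerate(senses, start=1)
    )
    return (
        '<idx:entry scriptable="yes">\n'
        f"  <idx:orth value={quoteattr(headword)}>\n"
        "    <idx:infl>\n"
        f"{iforms}\n"
        "    </idx:infl>\n"
        "  </idx:orth>\n"
        f"{body}\n"
        "</idx:entry>"
    )


# Example data taken from the first post: two senses of the same spelling.
print(merged_entry("cítiť", ["to feel", "to smell"], ["cítil"]))
```

Because there is only one entry per spelling, the lookup has nothing to choose between; the cost is that the reader sees all senses on one page instead of separate tabs.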

I'll believe AI is more than marketing speak for a special kind of database and that "machine learning" is more than human curated/validated storing of input data to later match when we have spelling checkers and grammar checkers much better than the 1980s. So far we still have not a single AI example.
Old 04-29-2020, 02:03 PM   #7
Doitsu
Grand Sorcerer
Doitsu ought to be getting tired of karma fortunes by now.
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by geotadams View Post
That was my first thought too. Unfortunately, when I attempted it, the unpacking ran for hours on end and stopped at 40%, so I had to give up.
Have you tried the KindleUnpack Calibre plugin?
It typically takes less than a minute to unpack a .mobi dictionary. Unfortunately, the plugin can't reverse-engineer the inflections of all dictionaries. The Duden is one of those books. Try unpacking one of the bilingual Oxford dictionaries instead.

Quote:
Originally Posted by geotadams View Post
Is there a way for the average user to implement what Duden was able to do, or am I going to just have to put all the possible definitions in one entry?
You'll have to define inflections for each entry. Here's an example from the Kindle Publishing Guidelines.

Code:
<idx:entry name="english" scriptable="yes" spell="yes">
    <idx:short><a id="1"></a>
        <idx:orth value="aardvark">
            <b>aard•vark</b>
            <idx:infl>
                <idx:iform value="aardvarks"></idx:iform>
                <idx:iform value="aardvark’s"></idx:iform>
                <idx:iform value="aardvarks’"></idx:iform>
            </idx:infl>
        </idx:orth>
        <p> A nocturnal burrowing mammal native to sub-Saharan Africa that feeds exclusively on ants and termites.</p>
    </idx:short>
</idx:entry>

Last edited by Doitsu; 04-29-2020 at 02:31 PM.
Old 04-29-2020, 04:15 PM   #8
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 11,164
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by Doitsu View Post
You'll have to define inflections for each entry.
Again, given the complexity and variation of languages, any automatic system would be poor, so that makes sense. I have to do the same with my LibreOffice Writer custom dictionaries.
Old 04-29-2020, 09:03 PM   #9
geotadams
Junior Member
geotadams began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Apr 2020
Device: Kindle Basic
Quote:
Originally Posted by Doitsu View Post
Have you tried the KindleUnpack Calibre plugin?
It typically takes less than a minute to unpack a .mobi dictionary. Unfortunately, the plugin can't reverse-engineer the inflections of all dictionaries. The Duden is one of those books. Try unpacking one of the bilingual Oxford dictionaries instead.



You'll have to define inflections for each entry. Here's an example from the Kindle Publishing Guidelines.

Code:
<idx:entry name="english" scriptable="yes" spell="yes">
    <idx:short><a id="1"></a>
        <idx:orth value="aardvark">
            <b>aard•vark</b>
            <idx:infl>
                <idx:iform value="aardvarks"></idx:iform>
                <idx:iform value="aardvark’s"></idx:iform>
                <idx:iform value="aardvarks’"></idx:iform>
            </idx:infl>
        </idx:orth>
        <p> A nocturnal burrowing mammal native to sub-Saharan Africa that feeds exclusively on ants and termites.</p>
    </idx:short>
</idx:entry>
So I finally did manage to figure out what Duden did, and unfortunately the answer is not what I wanted it to be. Basically, it seems I would have to make each inflection into its own separate entry for this to work. This code doesn't work; searching 'cúvol' only brings up the first entry:
Code:
<idx:entry scriptable="yes">
    <h2>
        <idx:orth value="cúvnuť">
            <idx:infl>
                <idx:iform value="cúvol" />
            </idx:infl>
        </idx:orth>
    </h2>
    vor etw. kneifen
</idx:entry>

<idx:entry scriptable="yes">
    <h2>
        <idx:orth value="cúvnuť">
            <idx:infl>
                <idx:iform value="cúvol" />
            </idx:infl>
        </idx:orth>
    </h2>
    irgendetwas pron; irgendwas pron
</idx:entry>

<idx:entry scriptable="yes">
    <idx:orth value="cúvnuť">
        <idx:infl>
            <idx:iform value="cúvol" />
        </idx:infl>
    </idx:orth>
    Test
</idx:entry>
but this does, accurately showing all three definitions as intended:
Code:
<idx:entry scriptable="yes">
    <h2>
        <idx:orth value="cúvol">
        </idx:orth>
    </h2>
    vor etw. kneifen
</idx:entry>

<idx:entry scriptable="yes">
    <h2>
        <idx:orth value="cúvol">
        </idx:orth>
    </h2>
    irgendetwas pron; irgendwas pron
</idx:entry>

<idx:entry scriptable="yes">
    <idx:orth value="cúvol">
    </idx:orth>
    Test
</idx:entry>
Unless anyone has a solution for this that doesn't require making separate entries for every inflection (that would be an insane 60 entries for one word) I guess I'll have to live without. Bummer.
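
If separate entries per inflected form really are required, they need not be written by hand. A rough Python sketch, assuming the senses and inflected forms for a headword are available as plain lists; entries_per_form is a hypothetical helper, and showing the headword in bold is an illustrative addition. Whether the Kindle displays every generated entry rests on the test reported in this post, not on anything verified here.

```python
from xml.sax.saxutils import escape, quoteattr


def entries_per_form(headword, senses, forms):
    """Yield one idx:entry per (form, sense) pair, with no idx:infl block."""
    for form in forms:
        for meaning in senses:
            yield (
                '<idx:entry scriptable="yes">\n'
                # The lookup key is the inflected form itself.
                f"  <idx:orth value={quoteattr(form)}></idx:orth>\n"
                # The visible text still names the dictionary headword.
                f"  <b>{escape(headword)}</b> {escape(meaning)}\n"
                "</idx:entry>"
            )


# Data from the test case above; a real run would loop over all headwords.
senses = ["vor etw. kneifen", "irgendetwas pron; irgendwas pron", "Test"]
forms = ["cúvnuť", "cúvol"]
print("\n\n".join(entries_per_form("cúvnuť", senses, forms)))
```

The number of entries is the number of forms times the number of senses, so the source file grows quickly, but the dict.cc data stays the single place that is edited by hand.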