05-20-2009, 04:06 PM   #1
Nate the great
Sir Penguin of Edinburgh
Posts: 10,393
Karma: 3161371
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
Proposal: Extending Epub with reference book tags

I make my own reference books in Mobipocket, and I would like to do the same in Epub. Unfortunately, the needed tags don't exist yet in the Epub spec.

I'm going to get the ball rolling by listing the three details I've noticed about Mobipocket dictionaries (if I missed one please point it out):

1, an entry in an ebook's metadata that indicates it's a dictionary (necessary?);

2, two more entries in the OPF that indicate the input and output languages;

3, the set of tags in the content that define the parts of a data entry (idx:entry, idx:orth, idx:key, idx:short, idx:gramgrp, idx:subentry, idx:string, idx:ext-subentry). You can find more about them here.
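
To make item 2 concrete: Mobipocket-style OPF files carry the two languages in elements named DictionaryInLanguage and DictionaryOutLanguage. The sketch below reads them from a simplified fragment. The fragment leaves out namespaces, the language codes are sample values, and the helper name dictionary_languages is made up for illustration; an Epub equivalent would still have to be defined.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free OPF fragment; the language codes are sample values.
OPF = """\
<package>
  <metadata>
    <x-metadata>
      <DictionaryInLanguage>en</DictionaryInLanguage>
      <DictionaryOutLanguage>fr</DictionaryOutLanguage>
    </x-metadata>
  </metadata>
</package>
"""


def dictionary_languages(opf_text):
    """Return (input_language, output_language), or None if either is missing."""
    root = ET.fromstring(opf_text)
    source = root.findtext(".//DictionaryInLanguage")
    target = root.findtext(".//DictionaryOutLanguage")
    if source is None or target is None:
        return None
    return source.strip(), target.strip()


if __name__ == "__main__":
    print(dictionary_languages(OPF))
```

A reader could treat the presence of both elements as the "this is a dictionary" flag from item 1, which is one reason that first entry may not be strictly necessary.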


I don't think all of the tags are necessary. Here is what I would like to propose as a starting point. Also, I'm going to be shameless and simply copy the function and attributes of the existing Mobipocket tags. I've changed some of the names so they are easier to understand.

So, what do you think?
05-20-2009, 05:12 PM   #2
DaleDe
Grand Sorcerer
Posts: 9,540
Karma: 4597554
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
What was wrong with idx (index) for these entries? It would make more sense than idpf, I believe.

Dale
05-20-2009, 05:42 PM   #3
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
Why not use an existing XML-based dictionary format? Perhaps XDFX. The ePub standard has the concept of XML in-line islands, but I don't have a clear idea of what it takes to produce a dictionary that is also a valid ePub document.
05-20-2009, 05:57 PM   #4
DaleDe
Quote:
Originally Posted by wallcraft
Why not use an existing XML-based dictionary format? Perhaps XDFX. The ePub standard has the concept of XML in-line islands, but I don't have a clear idea of what it takes to produce a dictionary that is also a valid ePub document.

A quick look shows that it is mostly compatible with XHTML already. See http://xdxf.revdanica.com/drafts/vis...-draft-028.txt

It has the advantage of already having dictionaries available. Perhaps there isn't much work to do to make everything work together.

Dale
05-20-2009, 07:15 PM   #5
Nate the great
Quote:
Originally Posted by DaleDe
What was wrong with idx (index) for these entries? It would make more sense than idpf, I believe.

Dale

Ah, is that what "idx" stands for? That makes sense. I'd prefer to use something that will add meaning and be easier to identify. I'm using "idpf" as a placeholder.
05-20-2009, 10:45 PM   #6
Nate the great
Quote:
Originally Posted by wallcraft
Why not use an existing XML-based dictionary format? Perhaps XDFX. The ePub standard has the concept of XML in-line islands, but I don't have a clear idea of what it takes to produce a dictionary that is also a valid ePub document.

Interesting.

One problem is that XFDX isn't a standard yet. It's still in draft form. How can we adhere to something that will change in the near future? Also, I don't like that an article is identified by an <ar> tag and a keyword by a <k> tag. I'd prefer to spell out the whole word so it can be read more easily.

Here is a pretty good source of information on XML.
05-21-2009, 11:55 AM   #7
Nate the great
One thing I forgot to add last night was that while I don't want to adopt XFXD, I think that it's a good source of ideas. My original goal for this project was to add the tags I wanted to use right now. I've since realized that it might be better to include a larger set of tags so they can be used for more purposes. This lessens the chance that the extension will need to be revised in the future.
05-21-2009, 12:02 PM   #8
DaleDe
Quote:
Originally Posted by Nate the great
Interesting.

One problem is that XFDX isn't a standard yet. It's still in draft form. How can we adhere to something that will change in the near future? Also, I don't like that an article is identified by an <ar> tag and a keyword by a <k> tag. I'd prefer to spell out the whole word so it can be read more easily.

Here is a pretty good source of information on XML.

I agree with meaningful tags, but short is preferred if you have to type them in. Perhaps you should give input to XFDX; after all, it is still in draft form.

Dale
05-21-2009, 12:21 PM   #9
GeoffC
Chocolate Grasshopper ...
Posts: 26,896
Karma: 16968764
Join Date: Mar 2008
Location: Scotland
Device: Cybook Gen3 , Pocketbook 302 (Black) , Nexus 10: wife has PW
I don't know enough about how mobi/ePub work, or how the dictionary function in mobi works, but I'm grateful that someone is taking this on board, if not for now then for the future.

Thanks....
05-21-2009, 10:37 PM   #10
Nate the great
Here is my next working set of tags:

Quote:
<idpf:article> </idpf:article> - required; root
<idpf:title> </idpf:title> - required; must be first in article element
<idpf:keyword> </idpf:keyword> - optional; can be nested inside any tag
<idpf:stub> </idpf:stub> - optional; part of article shown in a pop up window; can be anywhere
<idpf:entry> </idpf:entry> - required; has optional name attribute
<idpf:subentry> </idpf:subentry> - optional; has optional name attribute; must be inside entry or subentry
<idpf:data> </idpf:data> - required; has optional name attribute; has required type attribute: number, text, image, link, graph, (table?); must be inside entry or subentry

I wrote them with this page in mind. It's not the most complex article, but it's up there. Note: XML is only for the data, not the formatting.
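
One way to picture the working set of tags is a small article plus a checker for the rules listed in the quote. This is only a sketch under stated assumptions: the namespace URI is a placeholder (the thread uses "idpf" only as a stand-in prefix), the "Iron" article content is invented, the helper name check_article is hypothetical, and the tentative "table" type is left out because the list marks it with a question mark.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); the proposal does not define one.
IDPF_NS = "http://example.org/idpf-reference"

# Invented sample article that uses only the tags from the working set.
SAMPLE = """\
<idpf:article xmlns:idpf="http://example.org/idpf-reference">
  <idpf:title>Iron</idpf:title>
  <idpf:stub>A metallic element, symbol Fe.</idpf:stub>
  <idpf:entry name="properties">
    <idpf:data name="atomic number" type="number">26</idpf:data>
    <idpf:subentry name="physical">
      <idpf:data name="phase" type="text">solid</idpf:data>
    </idpf:subentry>
  </idpf:entry>
</idpf:article>
"""

# Values listed for the required "type" attribute (without the tentative "table").
DATA_TYPES = {"number", "text", "image", "link", "graph"}


def q(name):
    """Build the namespace-qualified tag name that ElementTree uses internally."""
    return "{%s}%s" % (IDPF_NS, name)


def check_article(xml_text):
    """Return a list of rule violations for one article (empty if none found)."""
    root = ET.fromstring(xml_text)
    if root.tag != q("article"):
        return ["root element must be idpf:article"]
    problems = []
    children = list(root)
    if not children or children[0].tag != q("title"):
        problems.append("idpf:title must be first in the article element")
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for elem in root.iter():
        if elem.tag in (q("subentry"), q("data")):
            if parent_of[elem].tag not in (q("entry"), q("subentry")):
                problems.append("subentry/data must be inside entry or subentry")
        if elem.tag == q("data") and elem.get("type") not in DATA_TYPES:
            problems.append("data has a missing or unknown type: %r" % elem.get("type"))
    return problems


if __name__ == "__main__":
    print(check_article(SAMPLE))
```

The checker only covers the nesting and attribute rules that are spelled out in the list; it does not try to interpret what "required" means for entry or data.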
05-22-2009, 12:50 AM   #11
jgray
Fanatic
Posts: 512
Karma: 1018067
Join Date: Mar 2008
Device: Galaxy Tab 10.1 & Note II
MS Reader dictionary format

Since MS Reader also does dictionaries (and nicely, too), I was curious what sort of markup Reader used. I downloaded the Dictionary Authoring Kit from here: http://www.microsoft.com/reader/deve...loads/dak.aspx

It is a self-extracting archive, so I just unzipped it and found "dak.chm" in the Documentation folder. It seems that Microsoft uses a subset of TEI tags for Reader dictionaries. Very interesting that they used an existing standard.

I wonder if this existing method would be good to incorporate into epub, rather than reinventing the wheel? Of course, regardless of what method is used, we still need to wait until reading software supports dictionary lookup.

Here is a sample entry that I pasted from the "dak.chm" file:

Code:
Sample Dictionary Fragment
A typical EBDICT dictionary fragment might look like this:

<tei-ms:text>
 <tei-ms:body>
  <tei-ms:div0>
   <tei-ms:div1>
    <tei-ms:div2>
     <tei-ms:entry>
      <tei-ms:form>
       <tei-ms:orth>dictionary</tei-ms:orth>
       <tei-ms:syll>dic|tion|ar|y</tei-ms:syll>
      </tei-ms:form>
      <tei-ms:gramGrp><tei-ms:pos>n</tei-ms:pos></tei-ms:gramGrp>
      <tei-ms:sense n="1">
       <tei-ms:def>
       A reference book that contains words listed in alphabetical order and gives explanations of their meanings, often with additional information about grammar, pronunciation, and etymology.
       </tei-ms:def>
      </tei-ms:sense>
      <tei-ms:sense n="2">
       <tei-ms:def>
       A foreign-language reference book of words: a reference book that gives equivalents of words and phrases in two or more languages, often with translations from each language to the other in separate sections.
       </tei-ms:def>
       <tei-ms:eg>A Spanish-English dictionary</tei-ms:eg>
      </tei-ms:sense>
     </tei-ms:entry>
    </tei-ms:div2>
   </tei-ms:div1>
  </tei-ms:div0>
 </tei-ms:body>
</tei-ms:text> 

where:

<tei-ms:entry> delimits an entry 
<tei-ms:orth> gives the orthographic (written) form of the headword 
<tei-ms:syll> gives the syllabification 
<tei-ms:pos> specifies the part of speech (in this case, a noun) 
<tei-ms:sense> gives information about a particular sense of the word 
<tei-ms:def> gives the definition of the word in that sense 
<tei-ms:eg> gives an example of the usage of the word in that sense

I hope that the folks at IDPF are working on some type of dictionary format for epub. I commented last year over on Teleread that I thought dictionary support was something that was badly needed in epub.
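
For anyone who wants to experiment, here is a rough sketch of how reading software might pull the headword, part of speech and senses out of an entry like the one above. It makes some assumptions: the namespace URI is a placeholder (the pasted fragment does not show the real declaration), the definitions are shortened, and the function name read_entry is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); the fragment above declares none.
NS = {"t": "urn:example:tei-ms"}

# Shortened copy of the sample entry, with the placeholder namespace declared.
FRAGMENT = """\
<tei-ms:entry xmlns:tei-ms="urn:example:tei-ms">
  <tei-ms:form>
    <tei-ms:orth>dictionary</tei-ms:orth>
    <tei-ms:syll>dic|tion|ar|y</tei-ms:syll>
  </tei-ms:form>
  <tei-ms:gramGrp><tei-ms:pos>n</tei-ms:pos></tei-ms:gramGrp>
  <tei-ms:sense n="1">
    <tei-ms:def>A reference book that contains words listed in alphabetical order.</tei-ms:def>
  </tei-ms:sense>
  <tei-ms:sense n="2">
    <tei-ms:def>A foreign-language reference book of words.</tei-ms:def>
    <tei-ms:eg>A Spanish-English dictionary</tei-ms:eg>
  </tei-ms:sense>
</tei-ms:entry>
"""


def read_entry(xml_text):
    """Collect the headword, syllables, part of speech and senses of one entry."""
    entry = ET.fromstring(xml_text)
    senses = []
    for sense in entry.findall("t:sense", NS):
        senses.append({
            "n": sense.get("n"),
            "definition": sense.findtext("t:def", default="", namespaces=NS).strip(),
            "example": sense.findtext("t:eg", default=None, namespaces=NS),
        })
    return {
        "headword": entry.findtext("t:form/t:orth", namespaces=NS),
        "syllables": entry.findtext("t:form/t:syll", default="", namespaces=NS).split("|"),
        "pos": entry.findtext("t:gramGrp/t:pos", namespaces=NS),
        "senses": senses,
    }


if __name__ == "__main__":
    print(read_entry(FRAGMENT))
```

A lookup feature would then only need an index from each headword to its entry; the markup itself already separates the parts a pop-up would want to show.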
05-27-2009, 12:46 PM   #12
Nate the great
Quote:
Originally Posted by jgray
I wonder if this existing method would be good to incorporate into epub, rather than reinventing the wheel? Of course, regardless of what method is used, we still need to wait until reading software supports dictionary lookup.

I hope that the folks at IDPF are working on some type of dictionary format for epub. I commented last year over on Teleread that I thought dictionary support was something that was badly needed in epub.

I don't think the MSReader tags should be adopted, but you do have a good point. I'm now leaning towards recommending the adoption of XFXD tags as an extension to Epub. What does everyone think?
05-27-2009, 07:15 PM   #13
jgray
Quote:
Originally Posted by Nate the great
I don't think the MSReader tags should be adopted, but you do have a good point. I'm now leaning towards recommending the adoption of XFXD tags as an extension to Epub. What does everyone think?

I wasn't saying that the MS tags should be used specifically. However, since MS based their tags on TEI, I was wondering if IDPF couldn't do the same? If not TEI, then some other existing standard. Since epub is already based on existing standards, this would make more sense than starting from scratch for dictionary support.
05-27-2009, 07:16 PM   #14
jgray
BTW, do you have a link to some info about XFXD? Google isn't being helpful.
05-27-2009, 07:22 PM   #15
Nate the great
Wallcraft posted a link early in the thread:
http://xdxf.revdanica.com/

I got the letter order wrong.