05-20-2009, 04:06 PM | #1 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Proposal: Extending Epub with reference book tags
I make my own reference books in Mobipocket, and I would like to do the same in Epub. Unfortunately, the needed tags don't exist yet in the Epub spec.
I'm going to get the ball rolling by listing the three details I've noticed about Mobipocket dictionaries (if I missed one, please point it out):
1, an entry in the ebook's metadata that indicates it's a dictionary (necessary?);
2, two more entries in the OPF that indicate the input and output languages;
3, the set of tags in the content that define the parts of a data entry (idx:entry, idx:orth, idx:key, idx:short, idx:gramgrp, idx:subentry, idx:string, idx:ext-subentry).
You can find more about them here. I don't think all of the tags are necessary. Here is what I would like to propose as a starting point. Also, I'm going to be shameless and simply copy the function and attributes of the existing Mobipocket tags. I've changed some of the names so they are easier to understand. Quote:
So, what do you think? |
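For anyone who hasn't seen the Mobipocket side, here is a rough Python sketch of the two pieces listed above: one content entry and the pair of OPF language entries. The element names (idx:entry, idx:orth with a value attribute, DictionaryInLanguage and DictionaryOutLanguage inside x-metadata) are as I understand the Mobipocket conventions, so treat them as an assumption and check them against the Mobipocket documentation. The function names are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def build_idx_entry(headword: str, definition: str) -> str:
    """Return one Mobipocket-style dictionary entry as markup text.

    Assumed layout: idx:orth carries the lookup key, and the definition
    follows it as ordinary content inside idx:entry.
    """
    return (
        "<idx:entry>"
        f"<idx:orth value={quoteattr(headword)}>{escape(headword)}</idx:orth>"
        f"<p>{escape(definition)}</p>"
        "</idx:entry>"
    )


def build_language_metadata(in_lang: str, out_lang: str) -> str:
    """Return the assumed OPF x-metadata entries for the two languages."""
    return (
        "<x-metadata>"
        f"<DictionaryInLanguage>{escape(in_lang)}</DictionaryInLanguage>"
        f"<DictionaryOutLanguage>{escape(out_lang)}</DictionaryOutLanguage>"
        "</x-metadata>"
    )


if __name__ == "__main__":
    print(build_idx_entry("cat", "A small feline."))
    print(build_language_metadata("en", "fr"))
```

An Epub extension could keep this same shape and only change the prefix and element names.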
|
05-20-2009, 05:12 PM | #2 |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
What was wrong with idx (index) for these entries? It would make more sense than idpf, I believe.
Dale |
05-20-2009, 05:42 PM | #3 |
reader
Posts: 6,975
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3, Kobo Glo HD
|
Why not use an existing XML-based dictionary format? Perhaps XDXF. The ePub standard has the concept of in-line XML islands, but I don't have a clear idea of what it takes to produce a dictionary that is also a valid ePub document.
|
05-20-2009, 05:57 PM | #4 | |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Quote:
It has the advantage of already having dictionaries available. Perhaps there isn't much work to do to make everything work together. Dale |
|
05-20-2009, 07:15 PM | #5 |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Ah, is that what "idx" stands for? That makes sense. I'd prefer to use something that will add meaning and be easier to identify. I'm using "idpf" as a placeholder.
|
05-20-2009, 10:45 PM | #6 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Quote:
One problem is that XFDX isn't a standard yet. It's still in draft form. How can we adhere to something that will change in the near future? Also, I don't like that an article is identified by an <ar> tag and a keyword by a <k> tag. I'd prefer to spell out the whole word so it is easier to read. Here is a pretty good source of information on XML. |
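To show what spelling out the names would look like in practice, here is a small Python sketch that renames the terse <ar> and <k> tags. The long names "article" and "keyword" and the sample entry are my own placeholders, not part of any spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the terse tag names to spelled-out ones.
LONG_NAMES = {"ar": "article", "k": "keyword"}


def spell_out_tags(xml_text: str) -> str:
    """Rename every element found in LONG_NAMES; leave all others untouched."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = LONG_NAMES.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<ar><k>penguin</k>A flightless seabird.</ar>"
    print(spell_out_tags(sample))
    # prints: <article><keyword>penguin</keyword>A flightless seabird.</article>
```

Since the renaming is purely mechanical, existing dictionaries in the short-tag form could be converted rather than rewritten.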
|
05-21-2009, 11:55 AM | #7 |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
One thing I forgot to add last night was that while I don't want to adopt XFXD, I think that it's a good source of ideas. My original goal for this project was to add the tags I wanted to use right now. I've since realized that it might be better to include a larger set of tags so they can be used for more purposes. This lessens the chance that the extension will need to be revised in the future.
|
05-21-2009, 12:02 PM | #8 | |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Quote:
Dale |
|
05-21-2009, 12:21 PM | #9 |
Chocolate Grasshopper ...
Posts: 27,600
Karma: 20821184
Join Date: Mar 2008
Location: Scotland
Device: Muse HD , Cybook Gen3 , Pocketbook 302 (Black) , Nexus 10: wife has PW
|
I don't know enough about how mobi/ePub work, nor how the dictionary function in mobi works, but I'm grateful that someone is taking this on board, if not for now then for the future.
Thanks.... |
05-21-2009, 10:37 PM | #10 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Here is my next working set of tags:
Quote:
|
|
05-22-2009, 12:50 AM | #11 |
Fanatic
Posts: 548
Karma: 2928497
Join Date: Mar 2008
Device: Clara 2E & Sage
|
MS Reader dictionary format
Since MS Reader also does dictionaries (and nicely, too), I was curious what sort of markup Reader used. I downloaded the Dictionary Authoring Kit from here: http://www.microsoft.com/reader/deve...loads/dak.aspx
It is a self extracting archive, so I just unzipped it and in the Documentation folder, found "dak.chm". It seems that Microsoft uses a subset of TEI tags for Reader dictionaries. Very interesting that they used an existing standard. I wonder if this existing method would be good to incorporate into epub, rather than reinventing the wheel? Of course, regardless of what method is used, we still need to wait until reading software supports dictionary lookup. Here is a sample entry that I pasted from the "dak.chm" file: Code:
Sample Dictionary Fragment
A typical EBDICT dictionary fragment might look like this:
<tei-ms:text>
  <tei-ms:body>
    <tei-ms:div0>
      <tei-ms:div1>
        <tei-ms:div2>
          <tei-ms:entry>
            <tei-ms:form>
              <tei-ms:orth>dictionary</tei-ms:orth>
              <tei-ms:syll>dic|tion|ar|y</tei-ms:syll>
            </tei-ms:form>
            <tei-ms:gramGrp><tei-ms:pos>n</tei-ms:pos></tei-ms:gramGrp>
            <tei-ms:sense n="1">
              <tei-ms:def> A reference book that contains words listed in alphabetical order and gives explanations of their meanings, often with additional information about grammar, pronunciation, and etymology. </tei-ms:def>
            </tei-ms:sense>
            <tei-ms:sense n="2">
              <tei-ms:def> A foreign-language reference book of words: a reference book that gives equivalents of words and phrases in two or more languages, often with translations from each language to the other in separate sections. </tei-ms:def>
              <tei-ms:eg>A Spanish-English dictionary</tei-ms:eg>
            </tei-ms:sense>
          </tei-ms:entry>
        </tei-ms:div2>
      </tei-ms:div1>
    </tei-ms:div0>
  </tei-ms:body>
</tei-ms:text>
where:
<tei-ms:entry> delimits an entry
<tei-ms:orth> gives the orthographic (written) form of the headword
<tei-ms:syll> gives the syllabification
<tei-ms:pos> specifies the part of speech (in this case, a noun)
<tei-ms:sense> gives information about a particular sense of the word
<tei-ms:def> gives the definition of the word in that sense
<tei-ms:eg> gives an example of the usage of the word in that sense |
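As a rough idea of what reading software would have to do with an entry like this, here is a Python sketch that pulls out the headword, part of speech, and numbered senses. The fragment above does not declare a namespace for the tei-ms prefix, so the URI below is a placeholder I made up to get the XML to parse, and the definitions are abridged. The function name read_entry is also just for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the pasted fragment does not declare one.
TEI_MS = "urn:example:tei-ms"
NS = {"tei-ms": TEI_MS}

# Abridged copy of the sample entry, with the namespace bound on the root.
FRAGMENT = f"""
<tei-ms:entry xmlns:tei-ms="{TEI_MS}">
  <tei-ms:form>
    <tei-ms:orth>dictionary</tei-ms:orth>
    <tei-ms:syll>dic|tion|ar|y</tei-ms:syll>
  </tei-ms:form>
  <tei-ms:gramGrp><tei-ms:pos>n</tei-ms:pos></tei-ms:gramGrp>
  <tei-ms:sense n="1">
    <tei-ms:def> A reference book that contains words listed in alphabetical order. </tei-ms:def>
  </tei-ms:sense>
  <tei-ms:sense n="2">
    <tei-ms:def> A foreign-language reference book of words. </tei-ms:def>
    <tei-ms:eg>A Spanish-English dictionary</tei-ms:eg>
  </tei-ms:sense>
</tei-ms:entry>
"""


def read_entry(xml_text: str):
    """Return (headword, part of speech, {sense number: definition})."""
    entry = ET.fromstring(xml_text.strip())
    headword = entry.findtext("tei-ms:form/tei-ms:orth", namespaces=NS)
    pos = entry.findtext("tei-ms:gramGrp/tei-ms:pos", namespaces=NS)
    senses = {
        sense.get("n"): sense.findtext("tei-ms:def", default="", namespaces=NS).strip()
        for sense in entry.findall("tei-ms:sense", NS)
    }
    return headword, pos, senses


if __name__ == "__main__":
    headword, pos, senses = read_entry(FRAGMENT)
    print(headword, pos)
    for number, definition in senses.items():
        print(number, definition)
```

The lookup key is just the text of the orth element, which is the same role idx:orth plays on the Mobipocket side.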
05-27-2009, 12:46 PM | #12 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Quote:
|
|
05-27-2009, 07:15 PM | #13 |
Fanatic
Posts: 548
Karma: 2928497
Join Date: Mar 2008
Device: Clara 2E & Sage
|
I wasn't saying that the MS tags should be used specifically. However, since MS based their tags on TEI, I was wondering if IDPF couldn't do the same? If not TEI, then some other existing standard. Since epub is already based on existing standards, this would make more sense than starting from scratch for dictionary support.
|
05-27-2009, 07:16 PM | #14 |
Fanatic
Posts: 548
Karma: 2928497
Join Date: Mar 2008
Device: Clara 2E & Sage
|
BTW, do you have a link to some info about XFXD? Google isn't being helpful.
|
05-27-2009, 07:22 PM | #15 |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Wallcraft posted a link early in the thread:
http://xdxf.revdanica.com/ I got the letter order wrong. |
|