Old 12-27-2009, 05:17 PM   #16
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Quote:
In any event, as I've said, Sigil should store any meta tags it can't recognize and convert to DC. These should then be exported to the OPF as they were, that is, bare meta tags (the spec supports this).
Does Sigil auto-generate the opf:file-as attribute when creating the OPF file?

m a r
Old 12-27-2009, 08:06 PM   #17
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by rogue_ronin View Post
Does Sigil auto-generate the opf:file-as attribute when creating the OPF file?
If you write the author as, for instance, "Doe, John", then that will be used as file-as and "John Doe" will be used as the standard value. But notice the comma.

And if your epub file has creator/contributor file-as, then that is loaded instead of the value.

It's rudimentary, but supports about 90% of use cases.
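
A minimal Python sketch of the comma rule described above. The function name `split_author` and the fallback for names without exactly one comma are assumptions for illustration, not Sigil's actual code:

```python
from typing import Tuple


def split_author(entry: str) -> Tuple[str, str]:
    """Return (display_name, file_as) for an author string.

    Assumption: exactly one comma means "Last, First". Anything else
    is treated as already being in display order.
    """
    entry = entry.strip()
    if entry.count(",") == 1:
        last, first = (part.strip() for part in entry.split(","))
        if last and first:
            return f"{first} {last}", f"{last}, {first}"
    return entry, entry


display, file_as = split_author("Doe, John")
print(display, "|", file_as)
```

Names with suffixes or several commas fall through unchanged here, which is roughly the gap the "90% of use cases" remark points at.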
Old 12-27-2009, 10:14 PM   #18
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Got it. Don't think that will work for a general-case XHTML file, though.

Have to keep thinking on it.

m a r
Old 12-28-2009, 04:06 PM   #19
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
I've updated the Sigil list with KevinH's additions, here.

Still looking for suggestions and guidance on XHTML metadata encoding...

m a r
Old 12-29-2009, 12:51 PM   #20
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Some suggestions

Hi,

Some thoughts, for what they are worth.

At first pass, I would try to stick with DC as much as possible when looking at extensions. Implementing an additional subset from the DCTERMS namespace might be a good thing to do, because the epub spec may grow to include those terms someday. Using the DCTERMS namespace also means they can be easily mapped to "name"/"content" pairs that can be stored and then passed through Sigil to the output OPF file without really having to process them.

These might include things like:

DCTERMS.abstract - "a short summary of the resource"
although this may be superseded by DC.description

DCTERMS.alternative - "alternative title"


DCTERMS.audience - "audience the resource is intended for" (e.g., children vs. adult, or PG-13, or Teens, or ...)


Additional date events to record:

DCTERMS.dateAccepted - "date of acceptance of the resource"

DCTERMS.dateCopyrighted - "date of copyright"

DCTERMS.dateSubmitted - "date of submission" (e.g., for a thesis or dissertation)



Additional license qualifiers:

DCTERMS.license - "license to use the resource" (public domain, etc)

DCTERMS.provenance - "statement of changes in ownership"

DCTERMS.rightsHolder - "person or org who owns the rights"

DCTERMS.accessRights - "who can access it, security status"



And the following two fields that qualify "Coverage":

DCTERMS.spatial - "spatial or geographic coverage"

DCTERMS.temporal - "time period covered"

Although you could argue that the standard DC.coverage is enough



And only the most basic "Relation" qualifiers:

DCTERMS.hasPart, DCTERMS.isPartOf

The hasPart can be the number in the Series, and isPartOf can be the "Series" name itself


DCTERMS.hasVersion, DCTERMS.isVersionOf

To allow support for different versions of the same book. Think "Jules Verne's Journey to the Centre of the Earth" - the original English translation versus a modern translation to English directly from French. They are very very different - in fact many older translators took "extreme liberties" to "enhance" the book they were translating for their audience.


All of the above would only be "passed through" Sigil and not processed for editing and things.
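
As a rough illustration of the "store the name/content pairs and pass them through" idea, here is a Python sketch that pulls DCTERMS.* pairs out of an XHTML head without interpreting them. The class and function names are hypothetical, and this is not how Sigil itself is implemented:

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here, because the
        # default handle_startendtag() calls handle_starttag().
        if tag == "meta":
            a = dict(attrs)
            if a.get("name") and a.get("content") is not None:
                self.pairs.append((a["name"], a["content"]))


def dcterms_pairs(xhtml):
    """Return the DCTERMS.* pairs found in the markup, in document order."""
    parser = MetaCollector()
    parser.feed(xhtml)
    return [(n, c) for n, c in parser.pairs if n.upper().startswith("DCTERMS.")]
```

The pairs are kept as opaque strings, which is all a pass-through needs.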

Then I think we have to go outside the DCTERMS namespace but only for the select few that really matter most:

Something along the lines of

YOURNAMESPACEHERE.name content="something".

I think the smallest most relevant subset should be the goal and not trying to replace all of the information from the card catalog system would be best.

My 2 cents,

...

KevinH
Old 12-29-2009, 01:13 PM   #21
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,

One other thing... I do not think we should keep track of concurrent versioning information in the meta data of the ebook.

So for example:

<meta name="FileName" content="FILENAME.EXT" />
<meta name="FileVersion" content="VERSION NUMBER/NAME" />
<meta name="FileScanner" content="NAME" />
<meta name="FileComment" content="COMMENT" />

These should not be in "released" eBooks. They should in fact be tracked by the concurrent versioning system used to keep track of editing changes and things *before* a version of the book is released.

For example, CVS, Mercurial, etc are all source code versioning systems that can be adapted to support concurrent editing and versioning of ebooks being worked on. That system would keep track of editorial changes, who made them, when they were made, the files changed, etc.

That information need not be part of the metadata of an "official release" of an eBook, in much the same way that the specific changes made to software, by whom, and when, are not part of what ships when the software is released; that history is kept internally only.


That said, I can see that many different organizations may make their own releases of the exact same public domain book, and as such, we do need to see the group doing the release.

If this fits under "DC.Publisher" then all is fine. If not, then we should probably add a specific "Generator" meta element to capture this information.

So something like:

name="Generator" content="org or person making the release"

as free form metadata, or

YOURNAMESPACEHERE.generator style.

Again, all of this is my 2 cents, feel free to ignore all of it.

Take care,

KevinH
Old 12-29-2009, 10:40 PM   #22
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Quote:
Originally Posted by KevinH View Post
...At first pass, I would try to stick with DC as much as possible when looking at extensions. Implementing an additional subset from the DCTERMS namespace might be a good thing to do, because the epub spec may grow to include those terms someday. Using the DCTERMS namespace also means they can be easily mapped to "name"/"content" pairs that can be stored and then passed through Sigil to the output OPF file without really having to process them...
Hmmm... well, the DCTERMS namespace contains equivalents of all the DC namespace terms, so it sorta makes sense to migrate to that, doesn't it? (And as you are grabbing all the common DC/DCTERMS terms in Sigil, too, it wouldn't be incompatible.) Any conversion to ePub would be super-easy if there were only the DCTERMS... and am I mistaken in thinking that if I were to do a proper namespace declaration in the head, that I could include both? (Probably necessitating redundancy, but that's not a big deal.)
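
On the "include both" question: as far as I know, the DCMI convention for HTML/XHTML declares each prefix with a `<link rel="schema.PREFIX" href="..." />` element in the head, and nothing stops a document from declaring DC and DCTERMS side by side. A small Python sketch that reads those declarations back; the class and function names are hypothetical:

```python
from html.parser import HTMLParser

HEAD = """
<head>
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
  <meta name="DC.title" content="TITLE" />
  <meta name="DCTERMS.alternative" content="OTHER TITLE" />
</head>
"""


class SchemaLinks(HTMLParser):
    """Map each declared prefix (DC, DCTERMS, ...) to its namespace URI."""

    def __init__(self):
        super().__init__()
        self.schemas = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = a.get("rel") or ""
        if tag == "link" and rel.lower().startswith("schema.") and a.get("href"):
            self.schemas[rel[len("schema."):]] = a["href"]


def declared_prefixes(xhtml):
    parser = SchemaLinks()
    parser.feed(xhtml)
    return parser.schemas


print(declared_prefixes(HEAD))
```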

I'm saying this, however, in an effort to come to the simplest, coherent list -- thus preferring to stick with one namespace, probably DCTERMS. Some of your suggestions below support this thought.

(I'm going to take some of your response out of order, here...)

Quote:
One other thing... I do not think we should keep track of concurrent versioning information in the meta data of the ebook.

So for example:

<meta name="FileName" content="FILENAME.EXT" />
<meta name="FileVersion" content="VERSION NUMBER/NAME" />
<meta name="FileScanner" content="NAME" />
<meta name="FileComment" content="COMMENT" />

These should not be in "released" eBooks. They should in fact be tracked by the concurrent versioning system used to keep track of editing changes and things *before* a version of the book is released.
I can see your argument regarding versioning. It does make sense to keep the version info in the content management software. I keep it in my files because it is easy to grab that info along with everything else when I open a project.

And that's probably because I don't keep old versions, though I do keep a list of former versions and modifications in the metadata. I also include a "version guide" in the metadata, that shows what the versions actually mean (so that they don't just have an arbitrary, undefined "improved" value.)

EG:
Code:
	<!-- BEGIN: FILE HISTORY -->

	<!-- Created on 2009-12-28 -->
	<!-- Revision # 0.10 on 2009-12-28 -->
	<!-- Current Revision # 0.50 on 2009-12-28 -->

	<!-- END: FILE HISTORY -->

	<!-- BEGIN: REVISION GUIDELINE -->

	<!--
	0.10 :: Initial Conversion
	0.20 :: Cover and Frontispiece
	0.30 :: Sections, Chapters and TOC
	0.40 :: Endnotes and/or Blockquotes
	0.50 :: Initial Spellcheck
	0.60 :: M-Dashes, Hyphens and Ellipses
	0.70 :: Italics, Bold, and Pre-Formatted Text
	0.80 :: Reading Proof
	0.90 :: Checked Against Canonical Source
	1.00 :: Final Version = Optimal
	1.++ :: Minor Error Corrections
	-->

	<!-- END: REVISION GUIDELINE -->
Still, having a modified date in the metadata is kind of the same thing, isn't it -- it just doesn't give you a sense of progress, or perfectedness, does it?

If one's use-model includes people sharing files and improving them (as mine does), modified dates may not be enough though, to indicate the relative value of individual files. Sort of like software version numbers -- those are a ready measure of a type of value. It's also something that happens "in the wild."
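
To make the "version number as a measure of value" idea concrete, here is a Python sketch that looks up the stage a revision number has reached in the guideline above. The lookup rule (the highest listed threshold not exceeding the number) and the "Unversioned" label are assumptions for illustration:

```python
from decimal import Decimal

# Thresholds copied from the REVISION GUIDELINE comment block.
REVISION_GUIDE = [
    (Decimal("0.10"), "Initial Conversion"),
    (Decimal("0.20"), "Cover and Frontispiece"),
    (Decimal("0.30"), "Sections, Chapters and TOC"),
    (Decimal("0.40"), "Endnotes and/or Blockquotes"),
    (Decimal("0.50"), "Initial Spellcheck"),
    (Decimal("0.60"), "M-Dashes, Hyphens and Ellipses"),
    (Decimal("0.70"), "Italics, Bold, and Pre-Formatted Text"),
    (Decimal("0.80"), "Reading Proof"),
    (Decimal("0.90"), "Checked Against Canonical Source"),
    (Decimal("1.00"), "Final Version = Optimal"),
]


def stage_for(version: str) -> str:
    """Return the guideline stage a revision number has reached."""
    v = Decimal(version)
    if v > Decimal("1.00"):
        return "Minor Error Corrections"  # the "1.++" row
    reached = [label for threshold, label in REVISION_GUIDE if threshold <= v]
    return reached[-1] if reached else "Unversioned"


print(stage_for("0.50"))
```

Decimal is used instead of float so that "0.10" and "0.1" compare exactly as written.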

FileNames may not matter -- it can usually be determined by a system call of some sort (but is there a case for knowing the original filename? It might allow for auto-renaming of related files such as images... Too speculative?)

FileScanner will probably have to wait for a MARC Relator Code to catch up to reality.

And FileComment -- I guess I keep thinking how and/or why a file/book has been created might be interesting or relevant (to research, or somesuch.) It's something that I use to auto-generate a Colophon, too, where such info is often expected. Don't Project Gutenberg texts often include such comments?

Note that a lot of my reasoning has to do with the file being encountered by someone Not-The-Producer, and suggesting to that N-T-P ways to keep a good, comprehensible accounting of their own work, as well as giving them as excellent a context as possible for understanding the current file.

Regarding your observations on the DCTERMS namespace:

Quote:
Originally Posted by KevinH View Post
These might include things like:

DCTERMS.abstract - "a short summary of the resource"
although this may be superseded by DC.description

DCTERMS.alternative - "alternative title"

DCTERMS.audience - "audience the resource is intended for" (e.g., children vs. adult, or PG-13, or Teens, or ...)

Additional date events to record:

DCTERMS.dateAccepted - "date of acceptance of the resource"

DCTERMS.dateCopyrighted - "date of copyright"

DCTERMS.dateSubmitted - "date of submission" (e.g., for a thesis or dissertation)

Additional license qualifiers:

DCTERMS.license - "license to use the resource" (public domain, etc)

DCTERMS.provenance - "statement of changes in ownership"

DCTERMS.rightsHolder - "person or org who owns the rights"

DCTERMS.accessRights - "who can access it, security status"

And the following two fields that qualify "Coverage":

DCTERMS.spatial - "spatial or geographic coverage"

DCTERMS.temporal - "time period covered"

Although you could argue that the standard DC.coverage is enough

And only the most basic "Relation" qualifiers:

DCTERMS.hasPart, DCTERMS.isPartOf

The hasPart can be the number in the Series, and isPartOf can be the "Series" name itself

DCTERMS.hasVersion, DCTERMS.isVersionOf

To allow support for different versions of the same book. Think "Jules Verne's Journey to the Centre of the Earth" - the original English translation versus a modern translation to English directly from French. They are very very different - in fact many older translators took "extreme liberties" to "enhance" the book they were translating for their audience.

All of the above would only be "passed through" Sigil and not processed for editing and things.
I think this is really good stuff. I'm going to take the above ideas and see if I can generate a set of (relatively) simple meta tags in the next post I make.

I'll separate it into a variant of my current scheme, and some possible additions based on your suggestions here.

Quote:
Then I think we have to go outside the DCTERMS namespace but only for the select few that really matter most:

Something along the lines of

YOURNAMESPACEHERE.name content="something".

I think the smallest most relevant subset should be the goal and not trying to replace all of the information from the card catalog system would be best.
I'm with you there, but making our own namespace! I just started XHTML a few months ago... And if we're going to make our own namespace, we should just do everything to our own satisfaction, and ditch the DC stuff (or rather, steal freely and scratch off the serial numbers.)

From my list there's not much remaining, though, so it probably isn't necessary, unless you have some further suggestions. I don't want to recreate the entire card catalog, either, but we probably already have done most of one via DC/DCTERMS.

That file-as attribute is still a kneebiter, though.

Give me your comments on the next post if you still have interest! (That post will be up in an hour or two, I think.)

m a r
Old 12-30-2009, 01:31 AM   #23
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Selected DCTERMS XHTML Metadata Items...

These are the ones that I currently think are important enough to include:

Identifier
<meta name="DCTERMS.identifier" scheme="SCHEME NAME" content="SCHEME CODE" />

Title
<meta name="DCTERMS.title" content="TITLE" />

Author
<meta name="DCTERMS.creator.aut" content="NAME" />

Series Name
<meta name="DCTERMS.isPartOf" content="SERIES NAME" />

Series Number
<meta name="DCTERMS.hasPart" content="SERIES NUMBER" />

Type
<meta name="DCTERMS.type" content="GENRE or CLASSIFICATION" />

Subject
<meta name="DCTERMS.subject" content="KEYWORD(S)" />

Description
<meta name="DCTERMS.description" content="DESCRIPTION OF CONTENT" />

Publisher
<meta name="DCTERMS.publisher" content="PUBLISHER DATA" />

Publication Date
<meta name="DCTERMS.issued" content="YYYY(-MM(-DD))" />

Creation Date
<meta name="DCTERMS.created" content="YYYY(-MM(-DD))" />

Modification Date
<meta name="DCTERMS.modified" content="YYYY(-MM(-DD))" />

Copyright Date
<meta name="DCTERMS.dateCopyrighted" content="YYYY(-MM(-DD))" />

Copyright Holder
<meta name="DCTERMS.rightsHolder" content="NAME/ORG." />

Copyright Status
<meta name="DCTERMS.license" content="LICENSE/STATUS" />

Language
<meta name="DCTERMS.language" content="TWO-LETTER LANGUAGE CODE" />

Source
<meta name="DCTERMS.source" content="SOURCE DERIVED FROM" />
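
For anyone generating these tags from a script, a minimal Python sketch that emits them in the same shape as the list above, with attribute values escaped. The helper name `meta_tag` is an assumption for illustration:

```python
from xml.sax.saxutils import quoteattr


def meta_tag(name, content, scheme=None):
    """Build one XHTML <meta /> tag with escaped attribute values."""
    parts = [f"name={quoteattr(name)}"]
    if scheme:
        parts.append(f"scheme={quoteattr(scheme)}")
    parts.append(f"content={quoteattr(content)}")
    return "<meta " + " ".join(parts) + " />"


print(meta_tag("DCTERMS.title", "Tom & Jerry"))
print(meta_tag("DCTERMS.identifier", "SCHEME CODE", scheme="SCHEME NAME"))
```

Escaping matters mostly for titles and descriptions, where an ampersand or quote would otherwise break the XHTML.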

==================

Any additional creator or contributor may be added using the over 200 MARC Relator Codes:

Illustrator
<meta name="DCTERMS.creator.ill" content="NAME" />

Proofreader
<meta name="DCTERMS.contributor.pfr" content="NAME" />

Editor
<meta name="DCTERMS.contributor.edt" content="NAME" />

Cover Designer
<meta name="DCTERMS.contributor.cov" content="NAME" />
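
A small Python sketch of how the relator-code naming above could be automated. It covers only the five codes used in this post, and the creator/contributor split simply mirrors the examples here; it is an assumption, not a rule from the DC or MARC documentation:

```python
# MARC relator codes used in this post (a tiny subset of the full list).
RELATOR_CODES = {
    "author": "aut",
    "illustrator": "ill",
    "proofreader": "pfr",
    "editor": "edt",
    "cover designer": "cov",
}

# Assumption: these roles are written as "creator", the rest as "contributor".
CREATOR_CODES = {"aut", "ill"}


def role_meta_name(role):
    """Return the meta name for a role, e.g. DCTERMS.contributor.edt."""
    code = RELATOR_CODES[role.lower()]
    element = "creator" if code in CREATOR_CODES else "contributor"
    return f"DCTERMS.{element}.{code}"


print(role_meta_name("Cover Designer"))
```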

==================

Extensions that don't meet the DC spec, but do meet the ePub spec:

File-As
<meta name="DC.creator.aut" scheme="FileAs:Lastname, First Middle" content="Dr. First Middle Lastname, Esq." />
-- Part of the ePub spec, but generally useful to define document sorting. The scheme attribute will be ignored by any parser as an unknown scheme.
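
A tool that does want to honour this proposal would only need to look for the "FileAs:" prefix in the scheme attribute. A minimal Python sketch, where the function name is hypothetical and the prefix is taken from the example above:

```python
def file_as_from_scheme(scheme):
    """Return the sort name carried in a 'FileAs:...' scheme, else None."""
    prefix = "FileAs:"
    if scheme and scheme.startswith(prefix):
        value = scheme[len(prefix):].strip()
        return value or None
    return None


print(file_as_from_scheme("FileAs:Lastname, First Middle"))
```

Any other scheme value returns None, so parsers that do not know the convention lose nothing.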

==================

Others that maybe SHOULD be included (please make an argument against):

==

Abstract
<meta name="DCTERMS.abstract" content="SUMMARY OF CONTENT" />

-- I could be talked into this one. More useful for non-fiction.

==

Alternative Title
<meta name="DCTERMS.alternative" content="TITLE" />

-- Alternate Title, like a foreign name, or earlier (maybe offensive) name. This is more common than I originally thought.

==

Audience
<meta name="DCTERMS.audience" content="INTENDED AUDIENCE" />

-- Like age-ranges, or... something else? "Young Adult" is a really popular category at the moment.

==

==================

Others that should maybe NOT be included (please make an argument in favor):

Format
<meta name="DCTERMS.format" content="MEDIA/FILE TYPE" />

--I'm of a mind that it being an eBook, you're already pretty sure of the media and/or filetype.

==

Relation
<meta name="DCTERMS.relation" content="RELATED RESOURCE" />

-- The two refinements of this that allow us to keep Series Name and Series Number seem adequate.

==

Coverage
<meta name="DCTERMS.coverage" content="TIME, SPACE, or OTHER SPAN" />

-- Meh. Subject seems enough.

==

Provenance
<meta name="DCTERMS.provenance" content="OWNERSHIP HISTORY" />

-- I think this is about the actual, physical resource. I'm not sure it's relevant to an ebook.

==

Access Rights
<meta name="DCTERMS.accessRights" content="PERMISSION(S) TO ACCESS" />

-- Things like age restrictions, etc. Talk me into it. It'll be hard, I'm against most restrictions, even normal ones.

==

Date of Acceptance
<meta name="DCTERMS.dateAccepted" content="YYYY(-MM(-DD))" />

-- Some certifying authority acknowledges receipt/acceptance of a document. Meh.

==

Date of Submission
<meta name="DCTERMS.dateSubmitted" content="YYYY(-MM(-DD))" />

-- Some certifying authority is given a document. Double-meh.

==

Geographical/Spatial Coverage
<meta name="DCTERMS.spatial" content="SPATIAL RANGE" />

-- Seems unnecessarily redundant of the Coverage tag.

==

Date/Temporal Coverage
<meta name="DCTERMS.temporal" content="TEMPORAL RANGE" />

-- Also seems unnecessarily redundant of the Coverage tag.

==

Has Version
<meta name="DCTERMS.hasVersion" content="TITLE/NAME" />

-- Indicates another resource that is adapted from this one.

==

Is Version Of
<meta name="DCTERMS.isVersionOf" content="TITLE/NAME" />

-- Indicates a resource that this resource was adapted from.
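
Several tags in this post take a YYYY(-MM(-DD)) value. A Python sketch of a check for that pattern, assuming the parentheses mean "optional month, then optional day"; the function name is hypothetical:

```python
import re
from datetime import date

# Year, optionally followed by -MM, optionally followed by -DD.
_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_valid_dc_date(text):
    """True for YYYY, YYYY-MM or YYYY-MM-DD naming a real calendar date."""
    m = _DATE_RE.fullmatch(text)
    if not m:
        return False
    year, month, day = m.groups()
    try:
        # Missing parts default to 1 so a bare year or year-month validates.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True
```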

==================

Undefinable by DCTERMS, but possibly desired metadata:

File Name
-- The original name of the eBook file.

File Version
-- Using a defined versioning scheme. It's also a bit like a "#th Printing" statement.

File Comment
-- Information about how/why the ebook file was created.

Sub-title
-- Lots of books have these.

Publication City
-- Commonly used. Might be growing less relevant in the digital age.

==================

I'm always open to input, corrections and suggestions!

There are other ways to code this, but I'm looking for a relatively simple, consistent method that covers most everything. The DCTERMS namespace seems to be that method, as the DC namespace is more limited and requires a somewhat vague extension ("refinements").

Also, all the DCTERMS can be defined this way in XHTML, but the questions here are: What is generally useful for eBooks? What are absolutely necessary, what are not?

I'll update this post as it gets better defined.

m a r

ps: huge props to KevinH!

Last edited by rogue_ronin; 01-01-2010 at 05:52 PM. Reason: updated File-As proposal and reworked the "possibles".
Old 12-30-2009, 11:34 AM   #24
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,

Nice list ...

Two things. 1. The DC namespace is what the main epub spec is built on, and both DCTERMS. and DC. names and refinements are already supported where they overlap with the epub spec, so your first list of valid metadata recognized now is still the main one.

So all we need think about is what to **add** to the epub spec, and it looks like you agree with me that using DCTERMS as the main source of these additions is the way to go.

2. So if we just look at the additions to what is already covered by Sigil and the epub spec, you are suggesting the following, is that correct?

Series Name
<meta name="DCTERMS.isPartOf" content="SERIES NAME" />

Series Number
<meta name="DCTERMS.hasPart" content="SERIES NUMBER" />

Copyright Date
<meta name="DCTERMS.dateCopyrighted" content="YYYY(-MM(-DD))" />

Copyright Holder
<meta name="DCTERMS.rightsHolder" content="NAME/ORG." />

Copyright Status
<meta name="DCTERMS.license" content="LICENSE/STATUS" />


and then from non-dc / non-dcterms you are suggesting we add the following:

File Name
-- The original name of the eBook file.

File Version
-- Using a defined versioning scheme. It's also a bit like a "#th Printing" statement.

File Comment
-- Information about how/why the ebook file was created.

File-As
-- Part of the ePub spec, but generally useful to define document sorting.

Sub-title
-- Lots of books have these.

Publication City
-- Commonly used. Might be growing less relevant in the digital age.


Is that correct?

If so, I think that is a good list. Perhaps we could encourage others to add their two cents and see what they think.

I wish there was a way to "advertise" this topic to all people interested in book metadata on this forum to get more input.

Thanks,

Kevin
Old 12-30-2009, 12:01 PM   #25
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by KevinH View Post
So all we need think about is what to **add** to the epub spec and it looks like you agree with me that using DCTERMS as the main source of these additions is the way to go.
I just want to say that anything you want to *add* has to be already valid as per the spec. If you merely want to add <meta> tags to the OPF, that's fine by me since the spec says they can have whatever format they like (key-value pairs).

But anything beyond that I don't support.
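
A short Python sketch of what "bare meta tags in the OPF" amounts to: each stored pair becomes one `<meta name="..." content="..." />` child of the metadata element. The function name is hypothetical and namespaces are omitted to keep it short, so this is not a complete OPF writer:

```python
import xml.etree.ElementTree as ET


def append_passthrough(metadata, pairs):
    """Append one bare <meta name=... content=...> element per pair."""
    for name, content in pairs:
        ET.SubElement(metadata, "meta", {"name": name, "content": content})
    return metadata


# A real OPF <metadata> element lives in the OPF namespace; it is left
# out here for brevity.
metadata = ET.Element("metadata")
append_passthrough(metadata, [
    ("DCTERMS.isPartOf", "SERIES NAME"),
    ("FileVersion", "0.50"),
])
print(ET.tostring(metadata, encoding="unicode"))
```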
Old 12-30-2009, 12:23 PM   #26
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Valloric,

Understood. By **add** I only meant over and above what Sigil/epub already supports, **not** that we would add additional things to the epub spec.

The plan is that, after getting more input on what is useful, I would submit changes for you to approve. Those changes would simply pass all of these additions through to the OPF file (and read them back in when Sigil loads an epub), so that they are not lost or ignored as they would be now.

So then the docs would eventually highlight the metadata that is fully supported by Sigil and the epub spec (see the earlier post), followed by a set of recommended additions that will only be passed through to the OPF file so that they are not lost or ignored.

Then people who create metadata inside html files and for ebooks, can at least know what will be supported and what will simply be retained versus what will be ignored or lost.
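
The three outcomes (supported, retained, lost) can be sketched as a simple classifier in Python. The "supported" set below is just the fifteen Dublin Core elements, used as a stand-in; which names Sigil actually processes is an assumption here, as are the function name and the rule that a tag missing name or content is lost:

```python
# The fifteen DC 1.1 elements, standing in for "fully supported".
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def classify(name, content):
    """Return 'supported', 'retained' (passed through) or 'lost'."""
    if not name or content is None:
        return "lost"
    parts = name.split(".")
    if (len(parts) >= 2
            and parts[0].upper() in ("DC", "DCTERMS")
            and parts[1] in DC_ELEMENTS):
        return "supported"
    return "retained"


for tag in [("DCTERMS.creator.aut", "NAME"), ("DCTERMS.isPartOf", "SERIES"),
            ("FileName", "book.html"), (None, "orphan")]:
    print(tag[0], "->", classify(*tag))
```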

Sound good?

KevinH
Old 12-30-2009, 12:27 PM   #27
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by KevinH View Post
Then people who create metadata inside html files and for ebooks, can at least know what will be supported and what will simply be retained versus what will be ignored or lost.

Sound good?
Very good, yes.
Old 12-30-2009, 04:05 PM   #28
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
@Kevin: Yeah, those are the basic additions I'm looking at. There's also some stuff in the second section that I think might be good to use as well, but I'm not certain. I'd like it if more folks chimed in, too, but I suspect it takes a rare type of OCD to work on this stuff!

I'm trying to work out a set of basic, but reasonably thorough, XHTML metadata, preferably in a Dublin Core format, that is consistent with what Sigil uses or recognizes (because Sigil is the first app to take such a thing seriously.)

Technically, Sigil supports (or will support) the entire DC, because it will pass through all valid <meta> tags. So, technically, there's nothing to discuss in that area. But, of course, figuring out what is actually useful when creating an eBook, and putting together a simple list of what to use (from the myriad possibilities) is where this thread should work itself out. This XHTML eBook metadata stands somewhat apart from whatever form it may take later (particularly in a Sigil ePub.)

The most recent list is using entirely DCTERMS because it's consistent, is a superset of the DC namespace, and enables us to encode a larger set of metadata in a more specific way. The suggestions you made were spot-on; all I did was package them up nicely.

Since Sigil looks for DCTERMS as well as DC, there's no reason to mix different namespaces in this recommendation/spec. While Sigil's output will be only valid ePub spec, and thus may use the DC namespace, there's no reason to limit the input to that space since there is logic built into it to recognize a larger set of metadata -- and the resulting XHTML is simpler, more readable and consistent. (Makes it look like some actual thought went into it!)

As you've recognized, what Sigil understands on input, and what might be available in the metadata, are different lists. Someone could make a simple list of free-form terms to use; in fact, for everything that matches the ePub spec, it would be nice if there were a Sigil-specific free-form list.

Now, as to the stuff that cannot be matched to DCTERMS: simple enough, really -- just turn 'em into basic <meta> tags...

File Name
<meta name="FileName" content="FILENAME.EXT" />

File Version
<meta name="FileVersion" content="VERSION NUMBER" />

File Comment
<meta name="FileComment" content="COMMENT" />

File-As
<meta name="FileAs" content="LASTNAME, FIRST MIDDLE" />

Sub-title
<meta name="SubTitle" content="SUBTITLE" />

Publication City
<meta name="PublicationCity" content="CITY NAME" />
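If you generate these tags from a script rather than typing them, something like the following Python sketch would do. The `bare_meta` helper and the `extras` dict are hypothetical names of my own. The only real requirements are that attribute values get escaped and that each tag is self-closed so the result stays well-formed XHTML.

```python
from html import escape


def bare_meta(name, content):
    """Return one self-closed XHTML <meta> tag with escaped attribute values."""
    return '<meta name="{}" content="{}" />'.format(
        escape(name, quote=True), escape(content, quote=True)
    )


# Hypothetical values, using the tag names proposed in the list above.
extras = {
    "FileName": "FILENAME.EXT",
    "FileVersion": "VERSION NUMBER",
    "FileComment": "COMMENT",
    "FileAs": "LASTNAME, FIRST MIDDLE",
    "SubTitle": "SUBTITLE",
    "PublicationCity": "CITY NAME",
}

tags = [bare_meta(name, content) for name, content in extras.items()]
print("\n".join(tags))
```

Escaping matters mostly for free-text fields such as the comment or sub-title, where an ampersand or a double quote would otherwise break the attribute.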

(I think we're getting a new ePub spec this year -- maybe some of these will be included. I'm hoping for "Sub-title.")

I'd love to hear if someone can think of a way to map these to the DCTERMS. I'm also open to further arguments against them. I may be married to FileName, for instance, because I'm using it in my process so much.

@Valloric: Of the above, Sigil will largely just pass them through to the OPF: the only question is, is it reasonable that Sigil should recognize the File-As tag (much as it recognizes Author or Title)? There should only be one such tag, so it could sensibly be mapped to the primary creator.
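For reference, the comma rule Valloric described earlier in the thread ("Doe, John" is used as file-as and "John Doe" as the standard value) can be sketched in Python as below. This is only an illustration of that described behaviour, not Sigil's implementation. The `split_file_as` name and the choice to return `None` when there is no usable comma are my own assumptions.

```python
def split_file_as(value):
    """Return (display_name, file_as) for a creator string.

    A value of the form "LAST, FIRST" yields ("FIRST LAST", "LAST, FIRST").
    Anything without a usable comma is returned as the display name with no
    file-as, since the order of the parts cannot be guessed.
    """
    if "," not in value:
        return value.strip(), None
    # Split on the first comma only, so the rest stays with the given names.
    last, first = (part.strip() for part in value.split(",", 1))
    if not last or not first:
        return value.strip(), None
    return f"{first} {last}", f"{last}, {first}"


print(split_file_as("Doe, John"))
print(split_file_as("John Doe"))
```

The second call shows why the comma matters: without it there is nothing to tell the family name from the given name, so no file-as value is produced.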

On the other DCTERMS in the prior list: I'd love to hear some arguments, particularly for Abstract, Alternative Title and Audience. I tend to come from a fiction-book perspective, and might need some schooling on non-fiction.

m a r

Last edited by rogue_ronin; 12-30-2009 at 04:08 PM.
Old 12-30-2009, 05:31 PM   #29
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by rogue_ronin View Post
the only question is, is it reasonable that Sigil should recognize the File-As tag (much as it recognizes Author or Title)? There should only be one such tag, so it could sensibly be mapped to the primary creator.
Who says? I'm sorry, but you can't guess.
Old 12-30-2009, 06:01 PM   #30
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
<shatner>Damn you, File-As!!!</shatner>



m a r

Last edited by rogue_ronin; 12-30-2009 at 08:15 PM.
Tags
dublin core, epub, howto, metadata, xhtml
