MobileRead Forums - View Single Post

rogue_ronin · 12-29-2009, 10:40 PM

Quote:

Originally Posted by KevinH

...At first pass, I would try to stick with dc as much as possible when looking at extensions. Implementing an additional subset from the DCTERMS namespace might be a good thing to do because the epub spec may grow to include them someday. And, using the DCTERMS namespace means they can be easily mapped to "name", "content" pairs that can be stored and then passed through Sigil to the output opf file without really having to process them...

Hmmm... well, the DCTERMS namespace contains equivalents of all the DC namespace terms, so it sorta makes sense to migrate to that, doesn't it? (And as you are grabbing all the common DC/DCTERMS terms in Sigil, too, it wouldn't be incompatible.) Any conversion to ePub would be super-easy if there were only the DCTERMS... and am I mistaken in thinking that if I were to do a proper namespace declaration in the head, that I could include both? (Probably necessitating redundancy, but that's not a big deal.)
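For what it's worth, here is a minimal sketch of what I think declaring both would look like, going by the DCMI convention of putting schema links in the head. TITLE is just a placeholder, and the doubled-up title line shows the redundancy I mentioned:

```html
<head>
	<title>TITLE</title>

	<!-- Declare the prefixes used in the meta names below -->
	<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
	<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />

	<!-- Same value stated once per namespace (redundant, but harmless) -->
	<meta name="DC.title" content="TITLE" />
	<meta name="DCTERMS.title" content="TITLE" />
</head>
```

If we settled on DCTERMS alone, the DC link and the DC.title line would simply drop out.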

I'm saying this, however, in an effort to come to the simplest, coherent list -- thus preferring to stick with one namespace, probably DCTERMS. Some of your suggestions below support this thought.

(I'm going to take some of your response out of order, here...)

Quote:

One other thing... I do not think we should keep track of concurrent versioning information in the meta data of the ebook.

So for example:

<meta name="FileName" content="FILENAME.EXT" />
<meta name="FileVersion" content="VERSION NUMBER/NAME" />
<meta name="FileScanner" content="NAME" />
<meta name="FileComment" content="COMMENT" />

These should not be in "released" eBooks. They should in fact be tracked by the concurrent versioning system used to keep track of editing changes and things *before* a version of the book is released.

I can see your argument regarding versioning. It does make sense to keep the version info in the content management software. I keep it in my files because it is easy to grab that info along with everything else when I open a project.

And that's probably because I don't keep old versions, though I do keep a list of former versions and modifications in the metadata. I also include a "version guide" in the metadata, that shows what the versions actually mean (so that they don't just have an arbitrary, undefined "improved" value.)

EG:

Code:

	<!-- BEGIN: FILE HISTORY -->

	<!-- Created on 2009-12-28 -->
	<!-- Revision # 0.10 on 2009-12-28 -->
	<!-- Current Revision # 0.50 on 2009-12-28 -->

	<!-- END: FILE HISTORY -->

	<!-- BEGIN: REVISION GUIDELINE -->

	<!--
	0.10 :: Initial Conversion
	0.20 :: Cover and Frontispiece
	0.30 :: Sections, Chapters and TOC
	0.40 :: Endnotes and/or Blockquotes
	0.50 :: Initial Spellcheck
	0.60 :: Em Dashes, Hyphens and Ellipses
	0.70 :: Italics, Bold, and Pre-Formatted Text
	0.80 :: Reading Proof
	0.90 :: Checked Against Canonical Source
	1.00 :: Final Version = Optimal
	1.++ :: Minor Error Corrections
	-->

	<!-- END: REVISION GUIDELINE -->

Still, having a modified date in the metadata is kind of the same thing, isn't it -- it just doesn't give you a sense of progress, or perfectedness, does it?

If one's use-model includes people sharing files and improving them (as mine does), modified dates may not be enough though, to indicate the relative value of individual files. Sort of like software version numbers -- those are a ready measure of a type of value. It's also something that happens "in the wild."
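A rough sketch of what I mean, pairing a modified date with a version number. FileVersion is the name from the list quoted above, the values come from my example history, and none of this is a settled scheme:

```html
<!-- When the file last changed -->
<meta name="DCTERMS.modified" content="2009-12-28" />
<!-- How far along it is, per the revision guideline (0.50 = Initial Spellcheck) -->
<meta name="FileVersion" content="0.50" />
```

Two files with the same modified date could then still be ranked by how far along they are.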

FileName may not matter -- it can usually be determined by a system call of some sort (but is there a case for knowing the original filename? It might allow for auto-renaming of related files such as images... Too speculative?)

FileScanner will probably have to wait for a MARC Relator Code to catch up to reality.

And FileComment -- I guess I keep thinking how and/or why a file/book has been created might be interesting or relevant (to research, or somesuch.) It's something that I use to auto-generate a Colophon, too, where such info is often expected. Don't Project Gutenberg texts often include such comments?

Note that a lot of my reasoning has to do with the file being encountered by someone Not-The-Producer, and suggesting to that N-T-P ways to keep a good, comprehensible accounting of their own work, as well as giving them as excellent a context as possible for understanding the current file.

Regarding your observations on the DCTERMS namespace:

Quote:

Originally Posted by KevinH

These might include things like:

DCTERMS.abstract - "a short summary of the resource"
although this may be superseded by DC.description

DCTERMS.alternative - "alternative title"

DCTERMS.audience - "audience the resource is intended for" (e.g. children vs. adult, or PG-13, or teens, or ...)

Additional date events to record:

DCTERMS.dateAccepted - "date of acceptance of the resource"

DCTERMS.dateCopyrighted - "date of copyright"

DCTERMS.dateSubmitted - "date of submission" (e.g. for a thesis or dissertation)

Additional license qualifiers:

DCTERMS.license - "license to use the resource" (public domain, etc)

DCTERMS.provenance - "statement of changes in ownership"

DCTERMS.rightsHolder - "person or org who owns the rights"

DCTERMS.accessRights - "who can access it, security status"

And the following two fields that qualify "Coverage":

DCTERMS.spatial - "spatial or geographic coverage"

DCTERMS.temporal - "time period covered"

Although you could argue that the standard DC.coverage is enough

And only the most basic "Relation" qualifiers:

DCTERMS.hasPart, DCTERMS.isPartOf

The hasPart can be the number in the Series, and isPartOf can be the "Series" name itself

DCTERMS.hasVersion, DCTERMS.isVersionOf

To allow support for different versions of the same book. Think "Jules Verne's Journey to the Centre of the Earth" - the original English translation versus a modern translation to English directly from French. They are very very different - in fact many older translators took "extreme liberties" to "enhance" the book they were translating for their audience.

All of the above would only be "passed through" Sigil and not processed for editing and things.

I think this is really good stuff. I'm going to take the above ideas and see if I can generate a set of (relatively) simple meta tags in the next post I make.

I'll separate it into a variant of my current scheme, and some possible additions based on your suggestions here.
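As a first, rough sketch ahead of that post, a few of those terms might look like this as meta tags. The content values are placeholders, the series usage follows your hasPart/isPartOf suggestion, and I'm assuming the DCTERMS prefix has been declared in the head:

```html
<!-- Title and audience -->
<meta name="DCTERMS.alternative" content="ALTERNATIVE TITLE" />
<meta name="DCTERMS.audience" content="AUDIENCE" />

<!-- Dates and rights -->
<meta name="DCTERMS.dateCopyrighted" content="YYYY-MM-DD" />
<meta name="DCTERMS.license" content="LICENSE" />
<meta name="DCTERMS.rightsHolder" content="NAME" />

<!-- Series, as suggested: isPartOf = series name, hasPart = number in series -->
<meta name="DCTERMS.isPartOf" content="SERIES NAME" />
<meta name="DCTERMS.hasPart" content="NUMBER IN SERIES" />

<!-- Relation to another version, such as a different translation -->
<meta name="DCTERMS.isVersionOf" content="TITLE OF ORIGINAL" />
```

Since these are plain name/content pairs, they should pass through untouched, which I take to be your point.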

Quote:

Then I think we have to go outside the DCTERMS namespace but only for the select few that really matter most:

Something along the lines of

YOURNAMESPACEHERE.name content="something".

I think the smallest most relevant subset should be the goal and not trying to replace all of the information from the card catalog system would be best.

I'm with you there, but making our own namespace! I just started XHTML a few months ago...

And if we're going to make our own namespace, we should just do everything to our own satisfaction, and ditch the DC stuff (or rather, steal freely and scratch off the serial numbers.)

From my list there's not much remaining, though, so it probably isn't necessary, unless you have some further suggestions. I don't want to recreate the entire card catalog, either, but we probably already have done most of one via DC/DCTERMS.
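If it did turn out to be necessary, I'm guessing a private namespace would be declared the way the DCMI convention does it for DC: a schema link in the head, then prefixed meta names. This is only a sketch. The prefix is your placeholder, the URL is made up (example.org), and FileVersion stands in for whichever few terms survive:

```html
<!-- Hypothetical private namespace; the URL is a placeholder, not a real schema -->
<link rel="schema.YOURNAMESPACEHERE" href="http://example.org/ebook-meta/" />

<!-- A leftover term that has no DC/DCTERMS equivalent -->
<meta name="YOURNAMESPACEHERE.FileVersion" content="VERSION NUMBER/NAME" />
```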

That file-as attribute is still a kneebiter, though.

Give me your comments on the next post if you still have interest! (That post will be up in an hour or two, I think.)

m a r