10-01-2007, 06:52 PM   #38
bowerbird
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
actually, hadrien, i am extremely familiar with project gutenberg e-texts.
and the one thing i can tell you is that they're _consistently_ inconsistent.

so yes, some early books used all-caps for italics, rather than underscores.
and along the way, a variety of characters were used besides underscores...
and up until 2003 or so, when i became a severe pain-in-the-neck to them
on these issues, they didn't even feel any need to mark italics consistently...

even worse, they used all-caps for bold as well, and likewise felt no need
to be consistent with that either. (sometimes they didn't mark bold at all.)

i know all this because i have been working for some time now on means of
interpreting the p.g. e-texts in a way that restores the structural information.
the same type of work you do when you put texts into your database, except
i leave them as text. (so ordinary humans can continue to work with them...)

i've invented a form of non-markup markup -- i call it "zen markup language",
or z.m.l. (it's two steps more advanced than x.m.l.) -- where such structural
information is represented by a simple set of unobtrusive light-markup rules.

for instance, a regular chapter-header is preceded by 4 blank lines and followed
by 2 blank lines, thus allowing a viewer-application (which i've also programmed)
to automatically form a table of contents that is auto-hot-linked to the chapters...
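
here is a minimal python sketch of how a blank-line rule like that could be detected. it is not the actual viewer-application: the function name find_chapter_headers is hypothetical, and it assumes the rule means "at least 4 blank lines before and at least 2 after" with the header sitting on a single line.

```python
def find_chapter_headers(text):
    """Return (line_index, title) pairs for likely chapter headers.

    Assumption: a header is one non-blank line with at least 4 blank
    lines directly before it and at least 2 blank lines directly after.
    """
    lines = text.split("\n")
    headers = []
    for i, line in enumerate(lines):
        if not line.strip():
            continue  # blank lines are never headers
        before = lines[max(0, i - 4):i]
        after = lines[i + 1:i + 3]
        if (len(before) == 4 and all(not l.strip() for l in before)
                and len(after) == 2 and all(not l.strip() for l in after)):
            headers.append((i, line.strip()))
    return headers


def table_of_contents(text):
    """Build a plain list of header titles, in document order."""
    return [title for _, title in find_chapter_headers(text)]
```

the line indexes returned by find_chapter_headers are what a viewer would need to turn each table-of-contents entry into a jump to its chapter.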

other simple rules -- easy enough to be understood by a fourth-grader --
underlie all of the other structures that are commonly found in books...
you can see work that i've done, in action, by visiting this web-page:
you'll be particularly interested in the "test-suite" and "rules" examples...

i believe intelligent viewer-programs interpreting plain-ascii input e-texts
and presenting them in typographically-sophisticated ways is _the_ future.
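
as a small illustration of the idea, here is a hedged python sketch of one thing a viewer could do with plain-ascii input: turn _underscored_ words into italics when producing html. the function name render_italics and the exact matching rule (underscores must hug the text and not sit inside a word) are assumptions for this example, not z.m.l. rules.

```python
import html
import re

# An underscore pair counts as italics only when the opening underscore
# is not glued to a preceding word character and the closing underscore
# is not glued to a following one, so names like snake_case are ignored.
_ITALIC = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")


def render_italics(line):
    """Escape a plain-text line for HTML and wrap _spans_ in <i> tags."""
    escaped = html.escape(line, quote=False)
    return _ITALIC.sub(r"<i>\1</i>", escaped)
```

the source text stays readable as plain ascii; only the viewer's output gains the typography.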

the publishing companies, of course, in an attempt to raise the cost of entry,
will try to force e-books into the complexity of heavy-markup, but i believe
the revolution into self-publishing will push back with light-markup systems.
authors don't want to battle steep learning curves. they just want to write...
