11-04-2007, 03:24 PM   #54
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
kovidgoyal said:
> There is absolutely no reason why
> a converter cannot be developed that
> handles most of the inconsistencies correctly.

i agree. in fact, i've developed that converter.


> Your problem seems to be that
> you aim for perfect conversion of all texts.

ok, here's the thing. why "handle" inconsistencies
when you can _remove_inconsistencies_entirely_?

i intend to mount a mirror of the p.g. library which
has all of their inconsistencies removed, so that
no other developers have to deal with that rubbish.

in other words, i'm doing what the "whitewashers"
at project gutenberg should have done all along,
i.e., ensured that their e-texts were _consistent_.


> That's never going to happen.

a perfect converter that handles all inconsistencies
might not happen, but we don't really need _that_.

we need a darn-good converter to clean up _most_
of them, and then we need to be _diligent_ about
finding and correcting inconsistencies that remain...

at the point where you have lots of developers who
are adding value to the library with new features
-- features that will depend on consistent e-texts --
the inconsistencies will reveal themselves naturally.


> And how does inventing a new
> lightweight markup language
> (when there are already tons of them
> out there) solve anything?

well, none of them seemed perfect enough for me.
specifically, they didn't seem "light" enough for me.
i want "zen" markup, maybe even "zero" markup...

even markdown, which is the best of the bunch,
often seems like an "abbreviated" form of markup,
and not the radical departure that i'm looking for...

and that became even more true when i factored in
the types of features that i wanted to be automatic.

for instance, i want the table of contents linked to
the chapter-headings automatically, with no work.
further, i want the chapter-headings linked _back_
to the table of contents, again without _any_ work.
plus, i want to let the users jump from one chapter
to the previous and next chapters, automatically...
even in the middle of a chapter, i want to let them
jump to the beginning of that chapter, and to the
beginning of the _next_ chapter, _automatically_...
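
as a rough illustration of that kind of auto-linking, here is a minimal python sketch. it is not the actual z.m.l. code. it assumes a chapter-heading is any line that starts with "chapter" and a number, and the names `HEADING` and `link_chapters` and the html ids `toc` and `ch1`, `ch2`, ... are made up for the example. it only puts the jump-links at each heading, not in the middle of a chapter.

```python
import re
from html import escape

# assumption: a chapter-heading is a line starting with "chapter" plus a number
HEADING = re.compile(r"^chapter\s+(\d+)\b.*$", re.IGNORECASE)


def link_chapters(text):
    """Return html with an auto-built table of contents and chapter navigation."""
    lines = text.splitlines()

    # first pass: find the headings, in order of appearance
    headings = [
        (i, line.strip())
        for i, line in enumerate(lines)
        if HEADING.match(line.strip())
    ]

    # the table of contents links forward to every heading
    toc = ['<ul id="toc">']
    for n, (_, title) in enumerate(headings, start=1):
        toc.append(f'<li><a href="#ch{n}">{escape(title)}</a></li>')
    toc.append("</ul>")

    # second pass: emit the body, adding back/previous/next links at each heading
    heading_number = {i: n for n, (i, _) in enumerate(headings, start=1)}
    body = []
    for i, line in enumerate(lines):
        n = heading_number.get(i)
        if n is None:
            if line.strip():
                body.append(f"<p>{escape(line)}</p>")
            continue
        nav = ['<a href="#toc">contents</a>']
        if n > 1:
            nav.append(f'<a href="#ch{n - 1}">previous</a>')
        if n < len(headings):
            nav.append(f'<a href="#ch{n + 1}">next</a>')
        body.append(f'<h2 id="ch{n}">{escape(line.strip())}</h2>')
        body.append("<p>" + " | ".join(nav) + "</p>")

    return "\n".join(toc + body)
```

the point is that the person writing the text types nothing but the heading itself. every link is derived from where the headings fall.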

i want a link from a footnote referent in the body
to its note in the notes section, automatically, and
i want an auto-backlink from there to the referent.
(and if there are two referents to the same note
-- it happens -- then i want auto-backlinks to both.)
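
a minimal python sketch of the footnote case, again only as an illustration and not the actual z.m.l. code. it assumes referents look like `[1]` in the body and that each note is a line beginning with the same `[1]`. the function name `link_footnotes` and the ids `ref1-1`, `note1`, etc. are hypothetical, and html-escaping is left out to keep it short.

```python
import re
from collections import defaultdict

# assumption: a footnote referent is a number in square brackets, e.g. [1]
REF = re.compile(r"\[(\d+)\]")
NOTE = re.compile(r"\[(\d+)\]\s*(.*)")


def link_footnotes(body, notes):
    """Link each [n] in the body to its note, and each note back to every referent."""
    seen = defaultdict(int)  # note number -> how many referents found so far

    def referent(match):
        num = match.group(1)
        seen[num] += 1
        # each referent gets its own id, so two referents to one note stay distinct
        return f'<a id="ref{num}-{seen[num]}" href="#note{num}">[{num}]</a>'

    linked_body = REF.sub(referent, body)

    linked_notes = []
    for line in notes.splitlines():
        m = NOTE.match(line)
        if not m:
            linked_notes.append(line)
            continue
        num, rest = m.groups()
        text = f"[{num}] {rest}"
        # one backlink per referent that pointed at this note
        backs = " ".join(
            f'<a href="#ref{num}-{k}">back</a>' for k in range(1, seen[num] + 1)
        )
        if backs:
            text += " " + backs
        linked_notes.append(f'<p id="note{num}">{text}</p>')

    return linked_body, "\n".join(linked_notes)
```

counting the referents as they are found is what makes the "two referents, one note" case work: the note gets as many backlinks as there were referents.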

and when there's a pointer-reference in the text,
such as a reference to "chapter 2", then i want
that pointer-reference to be treated as a hotlink...

likewise, if there's a u.r.l., i want it to be a hotlink.
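
both kinds of hotlink can be found by pattern-matching. here is a minimal python sketch, offered as an illustration rather than the actual z.m.l. code. it assumes pointer-references take the form "chapter N", that chapters carry hypothetical html ids `ch1`, `ch2`, ..., and that a u.r.l. starts with `http://` or `https://`. the name `hotlink` and the `chapter_count` parameter are made up for the example.

```python
import re

# assumption: a u.r.l. runs from http(s):// up to the next space, quote or bracket
URL = re.compile(r"https?://[^\s<>\"]+")
# assumption: a pointer-reference is the word "chapter" followed by a number
POINTER = re.compile(r"\bchapter\s+(\d+)\b", re.IGNORECASE)


def hotlink(text, chapter_count):
    """Turn "chapter N" references and bare u.r.l.s in text into html links."""

    def pointer(match):
        n = int(match.group(1))
        # only link to chapters that actually exist in this book
        if 1 <= n <= chapter_count:
            return f'<a href="#ch{n}">{match.group(0)}</a>'
        return match.group(0)

    def url(match):
        raw = match.group(0)
        # sentence punctuation right after a u.r.l. is not part of the link
        link = raw.rstrip(".,;:!?)")
        tail = raw[len(link):]
        return f'<a href="{link}">{link}</a>{tail}'

    # a u.r.l. holds no whitespace, so it can never contain "chapter N";
    # the two substitutions therefore do not interfere with each other
    return URL.sub(url, POINTER.sub(pointer, text))
```

for example, `hotlink("see chapter 2.", chapter_count=3)` wraps "chapter 2" in a link to `#ch2`, while a reference to a chapter that does not exist is left as plain text.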

with the other forms of light-markup, you have to
code in all of those links manually. that's a pain...
avoiding such pain is the purpose of light-markup,
at least as far as i'm concerned. so i built my own.

plus, i did it as a puzzle, a challenge for my mind.
surely you can understand that? or maybe not...
because i just don't comprehend such questions...


> The gutenberg etexts are still going to have to
> be converted to that markup.

right. that's another reason i built my own version.
because i wanted it to be as close to "native" p.g.
as possible, to minimize the cost of bulk conversion.

as it is, the vast majority of most p.g. e-texts is
"already in" z.m.l. format. the big exception is
the front-matter at the top (e.g., the title-page).


> Any converter written by somebody who knows
> what he's doing will be designed to represent
> semantic information internally using an object
> model, then adding output formats will be trivial.

i don't know what "an object model" is.

and frankly, i don't really care, not in the slightest,
since "adding output formats" is not a big concern.

and evidently i don't even need to know what it is,
because i've been able to do conversions just fine.


> 1. You think of html as "heavy" markup.

actually, i judge html as "medium" markup.
you have to jump to xml/css to be "heavy",
and go to .tei or docbook if you're serious.
but i dunno, maybe you are not "serious"...


> Not everyone is as limited.

nope. just 92% of the population. my user-base,
as i refer to them. i'm content to give up the rest.
heck, i'll be happy with "authors who wanna write,
and not have to waste time doing stupid markup."


> 2. I'd have no problem with lightweight markup
> if all I cared about was simple texts with
> headings, a few links, and some images.

evidently you haven't looked at my test-suite.

i can handle all the features commonly found in
the p.g. e-texts, and indeed in almost all books...

and when i discover a need for new capabilities,
i just invent a way for the format to handle it...
(and that's the _easy_ part. the difficult part is
coding the viewer-program for the new feature.)

and frankly, what i can't handle, i don't need...


> I don't want my documents limited to
> the very small set of features imposed by
> lightweight markup.

well, when you say _that_, you're just betraying
that you don't have a clue about light-markup...

(and, by the way, we do call it "light markup",
not "lightweight markup", because "lightweight"
implies what you are trying to say directly here,
i.e., that it is "limited" in some way, and it's not.)

markdown, for instance, lets you include _any_
(x)html code right in your markdown document
-- it just passes it on through without treating it --
so there's absolutely _nothing_ that you cannot
include, so there is no "very small set of features"
that is being "imposed" on you by the framework.

but even aside from that, the number of things
which cannot be handled within the _standard_
markdown framework is quickly shrinking.

and if you include the additions to the standard
being implemented by stuff like multimarkdown,
you will find that you encounter no "limitations".

no offense intended, but if you want to criticize
light-markup, you will need to do some homework.

-bowerbird