Old 11-08-2007, 04:12 PM   #106
jbenny
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
Quote:
Originally Posted by DaleDe View Post
Well to get back to the original theme I just finished reading a gutenberg book that was actually in fairly good shape. But even so it had some annoying problems still in it after I have gone through and beautified it once.

These included: punctuation without spaces. two sentences run together with a period and no spaces after the period. Spelling checkers are a great tool to find problems in scanned books but some of them won't find this since they have been taught (programmed) to ignore words of this kind since they might be filenames.

The second problem was paragraph splits where they didn't belong. The sentence was not over and the new paragraph started with a small letter. It should not have been a paragraph split.

Hopefully a program could detect this sort of thing.

Dale
Dale, GutenMark will take care of a lot of these types of problems with PG texts. Some of them can be fixed with a decent text editor with search/replace capability (regular expressions would work even better for some issues). No matter which software you use, a human being will still have to proof the result, if you want it perfect.
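As a rough illustration of the search/replace idea (a minimal sketch, not what GutenMark does internally), a regular expression can add the missing space after sentence punctuation. The pattern and the function name `add_missing_spaces` are assumptions made for this example, and the output still needs a human proofread.

```python
import re

# Sentence punctuation followed directly by a capital letter,
# e.g. "the end.The next" -> "the end. The next".
# Requiring a lowercase letter before the punctuation skips most
# initials and abbreviations such as "U.S.A." (a heuristic, not exhaustive).
RUN_TOGETHER = re.compile(r"(?<=[a-z])([.!?])(?=[A-Z])")


def add_missing_spaces(text: str) -> str:
    """Insert one space after sentence punctuation that has none."""
    return RUN_TOGETHER.sub(r"\1 ", text)


sample = "It was late.The door opened!Who was there?Nobody."
fixed = add_missing_spaces(sample)
print(fixed)
```

The lowercase-before, uppercase-after requirement is deliberately conservative: it will miss some real errors, but it is less likely to damage text that was already correct.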
Old 11-08-2007, 04:15 PM   #107
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
kovidgoyal said:
> To summarize your response,
> zml cannot support setting presentational aspects.

ok, _you_ don't get it. i'm sorry about that.

but i think it's sufficiently clear to _other_ people.

so i won't prolong the discussion. i'll explain it again,
one more time, but after that, i'll leave you in the dark.
because it's not really important if _you_ get it or not...

the _philosophy_ of z.m.l. puts the _locus_of_control_
for _presentational_matters_ into the hands of the reader.

so, the things that would fall within the purview of c.s.s.
-- in an xml/css world -- are found in the _zml-viewer_,
_not_ the file-format. if you look for things like drop-caps
in the _file-format_, you're just looking in the wrong place.
(and, because of that, you won't find them there. surprise!)

this is part of a much bigger _philosophy_ that it is far more
efficient -- in the long-run -- and a much better _strategy_
to put intelligence into our _applications_, not our _formats_.
the problem with putting smarts in the _format_ is that you
have to then mold the content to the format, whereas if we
put the smarts in the _apps_, they'll parse the raw content...
as before, this is far too big a concept for us to _discuss_,
so i'm only just laying it out, because we cannot "decide"
the issue here, that's for the real-world to do, but i thought
some lurkers might be interested in the "big picture" of that.


> HTML+CSS can support both structural
> and presentational aspects.

so can zen markup language.

the structural aspects are in the file-format, and
the presentational options are in the viewer-app.


> They give the author more control and more freedom.

they give more control. they don't give more "freedom".

some people will say z.m.l. gives them the freedom to
avoid doing the unpleasant (to them) task of markup...
it is those simplicity-loving people i wish to empower.
but control-lovers who prefer xml/css can still use that.

there are certainly some authors out there who want to
control the reading experience of their audience. fine!
i have no beef with 'em. really! if you will kindly notice,
i have said that here on these boards, i am one of them!
i want to control the linebreaks that people see when they
read my posts. so i make it so they don't have a choice...
but you will also kindly notice that lots of people resent it.
(ok, maybe only _some_ people, but they resent it _loudly_.)

this divide -- between how much control an author wants
to exert over the experience of the product of their art --
already exists in the world of e-books today. some authors
are happy to make their text available so readers can mold it
into whatever form the readers want. other authors _insist_
on using .pdf, so they can control what every page looks like.

i don't tell authors which way is wrong or right. i don't care!

what i _am_ saying is that, if you're one of those authors who
is willing to hand control over to the reader, i've got a format
that makes your job of being an author _much_ easier for you.
if some authors like that, fine. if a _lot_ of authors like it, fine.
if no authors like it, fine. it doesn't make any difference to me.
my paycheck will be the same either way.


> zml is forcing restrictions on authors.

wrong. it is true that authors cannot use z.m.l. to deliver
custom-formatted books. but many authors do not care.

if an author feels that the "standard look" of a zml-book
crimps their style and "forces restrictions" on them, fine,
they're totally free to go elsewhere and use another method.


> Indeed your whole attitude is that authors don't know
> what's good for them and you're going to tell them that.

no, my attitude is that some authors don't want to do markup,
so i'm gonna give them a simple format so they don't have to,
but can nonetheless provide their readers with e-books that are
both powerful and beautiful.


> If you want to encourage authors/digitizers to use
> only structural markup, a better approach would have been

well, thanks for the suggestion. but as you can probably tell,
i already have some very firm ideas about what i want to do...

so i'm not really soliciting your suggestions... :+)

-bowerbird
Old 11-08-2007, 04:27 PM   #108
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
dalede said:
> Well to get back to the original theme

hallelujah! :+)


> I just finished reading a gutenberg book
> that was actually in fairly good shape.
> But even so it had some annoying problems still in it
> after I have gone through and beautified it once.

that happens...


> These included: punctuation without spaces.
> two sentences run together with a period and
> no spaces after the period.

yeah, those are pretty common problems,
especially in e-texts that were done early on.


> The second problem was paragraph splits where
> they didn't belong. The sentence was not over and
> the new paragraph started with a small letter.
> It should not have been a paragraph split.

although i haven't had very good luck from doing it,
the standard suggestion is that you report the errors.
maybe they'll get back to you, or maybe they won't...
and maybe they'll fix the errors, or maybe they won't.
the e-mail address for reports is "errata@pglaf.com".

i built a public error-reporting capacity right into
every _page_ of my library. i believe it's important.
i offered it to p.g., but they weren't interested. ok.


> Hopefully a program could detect this sort of thing.

punctuation without spaces? sure thing.
two sentences run together with a period? yep.
no spaces after the period? easy to locate.

all these checks -- and a lot more -- are in the
programs that i've written to do o.c.r. clean-up.
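For readers wondering what checks of this kind might look like, here is a minimal sketch. It is not bowerbird's code, which is not shown in this thread; the rules, the `find_suspects` name, and the sample text are assumptions for illustration. It flags punctuation with no following space, plus paragraphs that begin with a lowercase letter after a paragraph that never ended its sentence.

```python
import re

# A lowercase letter, then punctuation, then a letter with no space between.
# Digits are not matched, so "3.14" and "10,000" are left alone.
NO_SPACE = re.compile(r"[a-z][,;:.!?][A-Za-z]")
# Characters that plausibly end a paragraph.
TERMINAL = (".", "!", "?", '"', "'", ":", ")")


def find_suspects(text: str) -> list[tuple[int, str]]:
    """Return (paragraph_number, reason) pairs for a human to review."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    suspects = []
    for i, para in enumerate(paragraphs, start=1):
        if NO_SPACE.search(para):
            suspects.append((i, "punctuation without following space"))
        # Lowercase start after an unterminated paragraph: likely a bad split.
        if i > 1 and para[0].islower() and not paragraphs[i - 2].endswith(TERMINAL):
            suspects.append((i, "paragraph starts mid-sentence"))
    return suspects


sample = "He walked to the door and\n\nopened it slowly.\n\nShe waited,then left."
report = find_suspects(sample)
for number, reason in report:
    print(number, reason)
```

A detector like this only reports candidates. Something like "readme.txt" would also be flagged, which is the same filename ambiguity that makes spelling checkers skip such words.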

-bowerbird
Old 11-08-2007, 04:29 PM   #109
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
jbenny said:
> No matter which software you use,
> a human being will still have to proof the result,
> if you want it perfect.

on the other hand, if you want it "perfect",
it's best not to rely on a human being...

-bowerbird
Old 11-08-2007, 04:40 PM   #110
TadW
Uebermensch
Posts: 2,583
Karma: 1094606
Join Date: Jul 2003
Location: Italy
Device: Kindle
Quote:
Originally Posted by bowerbird View Post
kovidgoyal said:
> To summarize your response,
> zml cannot support setting presentational aspects.

ok, _you_ don't get it. i'm sorry about that.

but i think it's sufficiently clear to _other_ people.

so i won't prolong the discussion. i'll explain it again,
one more time, but after that, i'll leave you in the dark.
because it's not really important if _you_ get it or not..
Why are you like this? Sorry, I don't get it. Just continue with your aggressive attitude towards highly respected MobileRead members, and for sure you'll be on everyone's ignore list - any time soon.
Old 11-08-2007, 04:40 PM   #111
kovidgoyal
creator of calibre
Posts: 45,376
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by bowerbird View Post
kovidgoyal said:
> To summarize your response,
> zml cannot support setting presentational aspects.

ok, _you_ don't get it. i'm sorry about that.

the _philosophy_ of z.m.l. puts the _locus_of_control_
for _presentational_matters_ into the hands of the reader.
Surely you can make the leap to the next logical step. If a file defines only structural elements, the only elements that the viewer app has control over are those structural elements. Not all elements in an ebook are structural. Occasionally, there is a need for special formatting for isolated instances. zml + viewer will NOT handle this.
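The point can be sketched in code. This is a hypothetical model, not ZML or any real viewer: the `Block` type, the `reader_prefs` settings, and the `render` function are all invented for illustration. When presentation is keyed only by structural role, two paragraphs necessarily come out styled the same, whatever the author intended for one of them.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str  # structural role only: "heading", "paragraph", "quote"
    text: str


# Reader-chosen presentation, keyed by structural role (hypothetical settings).
reader_prefs = {
    "heading": {"font_size": 24, "bold": True},
    "paragraph": {"font_size": 14, "bold": False},
    "quote": {"font_size": 14, "bold": False, "indent": 2},
}


def render(blocks, prefs):
    """Pair each block with the style the reader picked for its kind."""
    return [(b.text, prefs[b.kind]) for b in blocks]


book = [
    Block("heading", "Chapter 1"),
    Block("paragraph", "An ordinary paragraph."),
    Block("paragraph", "A paragraph the author wanted set in small caps."),
]
rendered = render(book, reader_prefs)
# Both paragraphs get the identical style: the file carries no per-instance hint.
same_style = rendered[1][1] == rendered[2][1]
print(same_style)
```

A viewer could of course apply its own heuristics beyond what the file says, but a one-off choice by the author has nowhere to live in a model like this one.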

Quote:
Originally Posted by bowerbird View Post
this is part of a much bigger _philosophy_ that it is far more
efficient -- in the long-run -- and a much better _strategy_
to put intelligence into our _applications_, not our _formats_.
the problem with putting smarts in the _format_ is that you
have to then mold the content to the format, whereas if we
put the smarts in the _apps_, they'll parse the raw content...
as before, this is far too big a concept for us to _discuss_,
so i'm only just laying it out, because we cannot "decide"
the issue here, that's for the real-world to do, but i thought
some lurkers might be interested in the "big picture" of that.
Umm so you're stating a philosophy as a motivation for the use of lightweight markup and then refusing to discuss its merits?

Quote:
Originally Posted by bowerbird View Post
> Indeed your whole attitude is that authors don't know
> what's good for them and you're going to tell them that.

no, my attitude is that some authors don't want to do markup,
so i'm gonna give them a simple format so they don't have to,
but can nonetheless provide their readers with e-books that are
both powerful and beautiful.

Again a better way to give authors this power is to create an authoring tool, not a file format.
Old 11-08-2007, 05:09 PM   #112
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
tadw said:
> Why are you like this?

like _what_? i said he doesn't get it. because he _doesn't_. what _should_ i do?

and i said i don't care if _he_ gets it or not. because i don't. why should i?

because it's senseless to explain it over and over. it just drives everyone away.

-bowerbird
Old 11-08-2007, 05:11 PM   #113
kovidgoyal
creator of calibre
Posts: 45,376
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by bowerbird View Post
tadw said:
> Why are you like this?

like _what_? i said he doesn't get it. because he _doesn't_. what _should_ i do?

and i said i don't care if _he_ gets it or not. because i don't. why should i?

because it's senseless to explain it over and over. it just drives everyone away.

-bowerbird
Be polite. And if you don't care whether people get what you're saying, don't post on public forums.
Old 11-08-2007, 05:24 PM   #114
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
kovidgoyal said:
> If a file defines only structural elements,
> the only elements that the viewer app
> has control over are those structural elements.

maybe viewer-apps in the way that _you_ conceive them
are such that they only follow directions given by the file.

but apps of the type that _i_ am building are not so dumb.
they will know a lot more than whatever the file tells them.


> Not all elements in an ebook are structural.
> Occasionally, there is a need for
> special formatting for isolated instances.

and here we are, once again, with the vague handwaving
about something that _might_ be needed _sometime_...

come up with something concrete in an actual p.g. e-text.

i've looked at those e-texts, lots and lots and lots of them,
and everything that i've seen in them, i know that i can do...

but, you know, i haven't exhaustively examined every one,
so if you can find something that my system cannot handle,
one way or another, i'll be quite happy to say "thank you"...
and then i'll go modify my system so it _can_ handle that...

but until you can do that, though, stop the vague handwave.


> zml + viewer will NOT handle this.

will not handle _what_? your imaginary boogieman? so what?


> Umm so you're stating a philosophy as a motivation
> for the use of lightweight markup and then
> refusing to discuss its merits?

look, i'm not asking you to _buy_ anything.
so there's no need to "discuss the merits"...
i just laid it out in case people were curious.

as i have said before, and will surely say again,
the proof is in the pudding. it's totally senseless
to debate _whether_ something will work or not.
build it, and if it works, it will be obvious to all...
and if you can't build it, or it doesn't work, then
_that_ will be equally obvious to all. talk is cheap.
working code is the standard i need for convincing.

and i'm writing that code myself, not asking you.
so don't waste my time "discussing the merits..."


> Again a better way to give authors this power is
> to create an authoring tool, not a file format.

i am creating _both_. and a whole lot more to boot.
i've pointed you and others to all kinds of my work.
if you have any criticism of _that_, i am all ears...
but i'm completely done with the vague handwaving.

-bowerbird
Old 11-08-2007, 05:31 PM   #115
kovidgoyal
creator of calibre
Posts: 45,376
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@bowerbird

To remind you, I actually said that zml would be a good fit for p.g. txt files. But, for the umpteenth time, not for a general ebook format.

As for specific examples, I gave you specific examples which you dismissed as "being from CSS". So your attitude seems to be, if you're given a specific example you say "zml won't handle that because it's specific". If you're then told that support for custom formatting of ebook elements is a feature missing from zml you say "I don't want to listen to you because you're not being specific".
Old 11-08-2007, 05:38 PM   #116
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
you're making this boring for the lurkers again. and being dishonest to boot.

if you want to say, "it's not good for a general e-book format", you'll have to
give a concrete reason why, or i'm not even going to bother to make a reply.

and i didn't "dismiss" your examples, i told you exactly how those things will
be handled in the z.m.l. environment, i.e., as user-options in the viewer-app.

custom-formatting by the _author_ is expressly _not_ supported by z.m.l.,
because the z.m.l. philosophy expressly gives presentation to the _reader_.

and i'm sure i've said all of that _clearly_enough_ so that a rational person
understands it perfectly well. and if i haven't, then maybe i just can't do it.
in which case people will have to pick it up intuitively when they use my apps.

-bowerbird
Old 11-08-2007, 05:47 PM   #117
GregS
Zealot
Posts: 107
Karma: 308
Join Date: Oct 2007
Location: Perth Australia
Device: EZ Reader 5", Iliad
I have just found this thread, and have only skimmed through a portion of it - I will read it more carefully this afternoon. Forgive these comments that may be off-topic.

Clearly marking up is the answer, but should it be dictated by ebook formats at all?

Gutenberg (the biggest project of its kind) is not and should not be seen simply as a resource for current ebooks. It is a resource of incredible value for many things yet to be seen. But the problem is that it is anchored in its past. Other collections are in html, but the variety of application proves problematical.

If you think novels are a problem, think about plays and poetry collections. Think also of the need to transform text into voice-synthesised readings, the problem of reference quoting, etc. The list of what may be wanted to be read, heard or otherwise used only gets more complex and unpredictable as readers become more widespread, and other means of dealing with literature are developed.

I would propose that the Gutenberg problem does not lie in marking up for ebooks, but rather a markup that allows easy translation to things like epub (a very good move).

It is not a matter of light vs heavy markup.

It is a matter of finding a light markup that can be transformed coherently and consistently into heavy markup, which may include voice markup, reference markup, and complete structural markup, potentially well beyond what any present reader can handle.

Yet at the same time can be used in a minimalist fashion and allow greater complexity to be added by future editors.

I would suggest that TEI (Text Encoding Initiative) is the only candidate.

However, anyone looking at it would faint from the apparent complexity of what could be done.

TEI Lite is only lite from a scholar's perspective.

However, it should be possible to prepare a consistent subset of the standard that is compatible with translation to epub, for instance.

So why bother? Why not just use something like epub?

The reason is that as a document is edited over time and more and more elements are placed in it, the thing has to be consistent. It is easy to substitute the main element names, etc., for those of, say, epub; it is just as easy to ignore all else (element-wise) by simple filtering.
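To make the substitute-and-filter idea concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element mapping is an assumption chosen for illustration (real TEI documents use an XML namespace and far more elements, and `hi` does not always mean emphasis), and the function names are hypothetical. Mapped elements are renamed; unmapped ones lose their tags but keep their words.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few TEI element names to XHTML names.
TEI_TO_XHTML = {"text": "div", "head": "h1", "p": "p", "hi": "em"}


def _append_text(dst, text):
    """Add text after the last child of dst, or to dst itself if it has none."""
    if not text:
        return
    if len(dst):
        dst[-1].tail = (dst[-1].tail or "") + text
    else:
        dst.text = (dst.text or "") + text


def _copy_children(src, dst):
    """Copy src's content into dst, renaming mapped tags and dropping others."""
    _append_text(dst, src.text)
    for child in src:
        name = TEI_TO_XHTML.get(child.tag)
        if name is None:
            _copy_children(child, dst)  # unknown tag: keep words, drop markup
        else:
            _copy_children(child, ET.SubElement(dst, name))
        _append_text(dst, child.tail)


def tei_to_xhtml(tei_source: str) -> str:
    root = ET.fromstring(tei_source)
    out = ET.Element(TEI_TO_XHTML.get(root.tag, "div"))
    _copy_children(root, out)
    return ET.tostring(out, encoding="unicode")


source = (
    "<text><head>Sonnet</head>"
    "<p>Shall I <hi>compare</hi> thee <persName>Will</persName>?</p></text>"
)
html = tei_to_xhtml(source)
print(html)
```

Going this direction is easy because the richer source simply gets thinned out. Nothing in the output tells you that "Will" was once marked as a person's name, which is why the richer format has to be the one that is archived.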

It is not so simple to add elements into a more restrictive scheme - that is the primary problem. It must be a system that allows for growing complexity over time.

I believe there is only one candidate. However, it needs to have simply implemented templates and there is no reason why the base markup should not be designed specifically for translation into existing ebook formats, or indeed good formats not yet used.

Now if this is done well there is no reason why source text markup cannot be translated on site as part of the download process. So instead of projects like Gutenberg keeping multiple file types, they keep one file type (TEI ultralite) and translate on the fly into whatever a reader may like to use (including varieties of plain text).
Old 11-08-2007, 05:47 PM   #118
kovidgoyal
creator of calibre
Posts: 45,376
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@bowerbird
As I've stated repeatedly, it is not good for a general ebook format because it does not support custom formatting of individual elements, as you stated quite clearly in your last post.

Again, giving control of presentation to the reader software is in general a good thing. Forcing all presentation to be done only by the reader software, is not. It is limiting and short sighted.

@GregS

I agree, for archival purposes, of books that need to be digitized, a lightweight system that has standard structural elements is the way to go. Like you, I think it should be rooted in some sort of system that can be simplified considerably, but that is extensible, to allow for the bells and whistles that authors like to add.

Last edited by kovidgoyal; 11-08-2007 at 05:55 PM.
Old 11-08-2007, 06:15 PM   #119
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
kovidgoyal said:
> it does not support custom formatting of individual elements

stop misrepresenting the facts. it's dishonest.

z.m.l. supports custom-formatting of individual elements by _the_reader_.
you know, all those people who are actually _reading_ the book...

it does _not_ support customization by the _author_, nor does it _require_ it,
which might well appeal to writers who want to _write_ and not do _markup_.

the choice as to whether or not to use z.m.l. will be made by the author.
not me. not you. not kovidgoyal. nobody, except the author. thank you.

-bowerbird
Old 11-08-2007, 06:20 PM   #120
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
You say the discussion period is over, bowerbird; you said almost the same thing several years ago. The time for "putting up" or "putting out" is now or never. The few examples you presented to us in HTML 4 on your web site do not convince us of the power of ZML; if anything, they make me believe that ZML is just a pipe dream in your mind.

You stated some years ago that you would open the sources and even claimed that ZML was developed under an open source license. That would lead one to assume that since you now will not release the source, a working source does not exist.

You have prattled on for a while here at MobileRead claiming victim status because everyone is picking on you. I can honestly tell you that everyone is not picking on you. Many have just set their defaults to ignore your posts because they have better things to do with their lives than listen to your unsupported boasts about technical, moral, and artistic superiority.

Now kovid, who you claim has little knowledge of formatting, conversions, or even ebooks, has developed a set of programs called libprs500. I have used it for many months with great results. Fictionwise has also adopted it and uses it for all of the Sony LRF formatted books they offer -- over 6,000 at last count and growing daily.

If, as you say, "the proof is in the pudding," then your several-year-old pudding has gone rancid.