11-08-2007, 04:27 PM   #108
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
dalede said:
> Well to get back to the original theme

hallelujah! :+)


> I just finished reading a gutenberg book
> that was actually in fairly good shape.
> But even so it had some annoying problems still in it
> after I have gone through and beautified it once.

that happens...


> These included: punctuation without spaces.
> two sentences run together with a period and
> no spaces after the period.

yeah, those are pretty common problems,
especially in e-texts that were done early on.


> The second problem was paragraph splits where
> they didn't belong. The sentence was not over and
> the new paragraph started with a small letter.
> It should not have been a paragraph split.

although i haven't had very good luck doing it,
the standard suggestion is that you report the errors.
maybe they'll get back to you, or maybe they won't...
and maybe they'll fix the errors, or maybe they won't.
the e-mail address for reports is "errata@pglaf.com".

i built a public error-reporting capacity right into
every _page_ of my library. i believe it's important.
i offered it to p.g., but they weren't interested. ok.


> Hopefully a program could detect this sort of thing.

punctuation without spaces? sure thing.
two sentences run together with a period? yep.
no spaces after the period? easy to locate.
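
for illustration, here is a minimal python sketch of that kind of check. it is not the clean-up code mentioned in this post. the function name `find_missing_spaces` and both patterns are assumptions made up for the example, and it only flags suspect spots for a person to review.

```python
import re

# a sentence-ending mark squeezed between two lowercase letters and a
# capital, e.g. "stopped.Then"; the two-letter lookbehind skips initials
# such as "U.S.A."
RUN_TOGETHER = re.compile(r"(?<=[a-z]{2})[.!?](?=[A-Z])")

# a comma, semicolon, or colon followed directly by a letter, e.g. "ran,fast"
NO_SPACE = re.compile(r"[,;:](?=[A-Za-z])")


def find_missing_spaces(text):
    """Return sorted (line_number, column, mark) tuples for suspect spots."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern in (RUN_TOGETHER, NO_SPACE):
            for m in pattern.finditer(line):
                # columns are 1-based and point at the punctuation mark
                hits.append((line_no, m.start() + 1, m.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    sample = "He stopped.Then he ran,fast.\nAll fine here. Yes, really.\n"
    for line_no, col, mark in find_missing_spaces(sample):
        print(f"line {line_no}, column {col}: no space after {mark!r}")
```

the patterns are kept narrow on purpose. a looser pattern would catch more errors, but it would also flag abbreviations and times like "3:15", so the trade-off is left to whoever reviews the hits.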

all these checks -- and a lot more -- are in the
programs that i've written to do o.c.r. clean-up.
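
the bad paragraph splits described above can be flagged in a similar way. this is only a sketch under stated assumptions, not the actual clean-up code: it assumes paragraphs are separated by blank lines, and the name `find_suspect_breaks` and the rule itself (a paragraph starts with a small letter and the one before it does not end a sentence) are made up for the example.

```python
import re

# end of a sentence: . ! or ? optionally followed by closing quotes/brackets
SENTENCE_END = re.compile(r"[.!?][\"')\]]*$")


def find_suspect_breaks(text):
    """Return 1-based numbers of paragraphs that look like continuations."""
    # assumption: a blank line (possibly holding spaces) separates paragraphs
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    suspects = []
    for i in range(1, len(paragraphs)):
        prev, cur = paragraphs[i - 1], paragraphs[i]
        # small first letter plus an unfinished previous sentence
        if cur[0].islower() and not SENTENCE_END.search(prev):
            suspects.append(i + 1)
    return suspects


if __name__ == "__main__":
    sample = (
        "The rain fell on the\n\n"
        "old stone bridge all night.\n\n"
        "Morning came at last.\n"
    )
    print(find_suspect_breaks(sample))
```

requiring both conditions keeps the check quiet on text that is deliberately set in lowercase. dropping the second condition would catch more splits at the cost of more false alarms.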

-bowerbird