dalede said:
> curly quotes (double and single), curly apostrophes,
> and dashes that are really dashes.
curly-quotes and em-dashes, how could i forget those? :+)
oh well, like i said, it was a list off the top of my head...
***
jswolfe said:
> Isn't that up to the software
> doing the displaying of the file as to
> how it displays the page number?
yes sir, there is a mixture of things in the list,
some of which aren't relevant for all situations,
and some of which are geared to functionality,
not beauty. (except functionality _is_ beautiful.)
practically none of them are fully cut-and-dried.
for that particular item, i was thinking of .pdf,
and other formats of the fixed-page persuasion,
where the number of total pages is known for
any particular conversion (e.g., at textsize=12),
so as to be a basis for a relative comparison
with another conversion (e.g., at textsize=16)...
something like "page 180" has very little meaning
if we don't know if there are 200 or 800 or 1600
pages in any one particular conversion of a book.
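
to make the "page 180 of how many?" point concrete, here is a minimal sketch in python. the function names are made up for illustration, and it assumes page numbers start at 1 and that text is spread roughly evenly across pages, which real conversions only approximate.

```python
def relative_position(page, total_pages):
    """Return how far through the book a page falls, as a fraction up to 1."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")
    return page / total_pages


def equivalent_page(page, total_pages, other_total_pages):
    """Map a page in one conversion to the nearest page in another one.

    Assumption: both conversions hold the same text, so position scales
    linearly with the total page count.
    """
    fraction = relative_position(page, total_pages)
    # never return page 0, even for very short target conversions
    return max(1, round(fraction * other_total_pages))


# "page 180" means very different things in different conversions
print(relative_position(180, 200))     # near the end of the book
print(relative_position(180, 1600))    # still close to the start
print(equivalent_page(180, 200, 800))  # same spot in a longer conversion
```

the page number only becomes comparable across conversions once it is paired with the total, which is why the total ought to be shown alongside it.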
> It should not be too hard to make such an app.
> to do some of what's needed an initial clean up.
actually, it's not really easy, for the simple reason that
p.g. e-texts have maddeningly inconsistent formatting...
even where there is a straightforward rule on something --
e.g., there should be 4 blank lines before a chapter heading
-- consistency checks weren't done to ensure that's the case.
so even something as relatively simple as finding headings
must be engineered to catch inconsistencies, and of course,
since you don't _know_ all the ways they were inconsistent,
it's not that simple to know what code you have to engineer.
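
as a rough illustration of the heading problem, here is a sketch in python, not anything p.g. or i actually run. it treats any line that follows a run of blank lines as a possible heading and notes whether the run matches the 4-blank-line rule. the names and the `min_blank=2` tolerance are assumptions picked for the example.

```python
def find_heading_candidates(lines, expected_blank=4, min_blank=2):
    """Find lines that might be chapter headings, based on preceding blank lines.

    Returns a list of (line_index, blank_lines_before, follows_rule) tuples.
    follows_rule is True only when the blank run is exactly expected_blank,
    so the False entries are the inconsistencies a person should review.
    """
    candidates = []
    blank_run = 0
    for index, line in enumerate(lines):
        if line.strip() == "":
            blank_run += 1
            continue
        if blank_run >= min_blank:
            candidates.append((index, blank_run, blank_run == expected_blank))
        blank_run = 0
    return candidates


sample = [
    "intro text", "", "", "", "",
    "CHAPTER I", "some text", "", "", "",
    "CHAPTER II", "more text", "",
    "an ordinary new paragraph",
]
for index, blanks, ok in find_heading_candidates(sample):
    status = "ok" if ok else "check this one"
    print(index, blanks, repr(sample[index]), status)
```

this only catches one kind of inconsistency (the wrong number of blank lines). headings that were marked in some other way would need their own checks, which is exactly the part you can't know in advance.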
the worst part of all -- as i'm sure you volunteers who have
converted p.g. e-texts already know -- is that p.g. employed
no method to inform us what lines should _not_ be rewrapped,
such as lines of poetry, lines in a table, lines in address-blocks,
and so on. finding and fixing these lines can be time-consuming.
and writing routines that can root them out is not entirely trivial.
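
to give a feel for what such a routine might look like, here is a heuristic sketch in python. it is only a guess at some useful signals (indentation, wide gaps that suggest table columns, and short lines in the middle of a block), and the thresholds `wrap_width=70` and `short_ratio=0.6` are assumptions, not values taken from any p.g. standard.

```python
import re


def lines_not_to_rewrap(lines, wrap_width=70, short_ratio=0.6):
    """Return indexes of lines that probably should not be rewrapped.

    Heuristics (all of them can misfire, so results need human review):
      - the line is indented (common for poetry and block quotes)
      - the line has a gap of 3+ spaces between words (table columns)
      - the line is much shorter than the wrap width even though
        more text follows directly (poetry, address blocks)
    """
    flagged = []
    for index, line in enumerate(lines):
        text = line.rstrip()
        if not text:
            continue
        indented = text[0] in " \t"
        columnar = re.search(r"\S {3,}\S", text) is not None
        next_is_text = index + 1 < len(lines) and lines[index + 1].strip() != ""
        short_mid_block = next_is_text and len(text) < wrap_width * short_ratio
        if indented or columnar or short_mid_block:
            flagged.append(index)
    return flagged


sample = [
    "This is an ordinary paragraph line that runs nearly to the margin, ok",
    "and it ends here.",
    "",
    "    Tyger Tyger, burning bright,",
    "    In the forests of the night;",
    "",
    "Name      Age",
    "",
    "Mr. Smith",
    "12 Elm Street",
    "London",
]
print(lines_not_to_rewrap(sample))
```

note that the last line of the address block ("London") is not flagged, since a short line followed by a blank looks just like the normal end of a paragraph. that sort of gap is why the flagged lines are best treated as suggestions for a person to confirm.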
my intention is to fix the files, and mount a mirror with my files.
michael hart graciously agreed to provide diskspace/bandwidth...
-bowerbird
p.s. i agree entirely that whatever beautification is done needs to
be subjected to the quality-control process of being read by people.
my hope is that the beauty of the files will be an alluring invitation.
for my part, i intend to make error-reporting and feedback _much_
more simple and responsive than it is with project gutenberg itself.
it will be more wiki-like, in that reports will be immediately visible...