11-07-2007, 11:09 PM   #76
Panurge
Enthusiast
Posts: 34
Karma: 336
Join Date: Dec 2006
Location: Texas
Device: Sony Reader
Kovidgoyal: [EDIT: An example from physics research articles. A resolution of sections is usually sufficient. i.e. people refer to section so-and-so of paper so and so.
I don't know if that is sufficient resolution in general though.]

Yes, I think that "resolution" is the problem. Paragraph numbers would probably work well for everything but poetry, though in some cases--such as the one you mention--larger units might be more practical. Page numbers work if one can pinpoint the exact edition (publisher, place, date, in addition to title and author) being referenced; that was the contribution of printing. For manuscript copies, logical divisions such as sections or paragraphs or line numbers (for verse) were the only alternative.
But are such things needed for electronic documents that can be searched for exact phrases? Presumably not. So long as one can identify the electronic source one is referring to, searching would suffice. But there's the rub. There is no system of cataloguing material that is purely electronic in origin. The URL of a web site, for instance, is an unstable identifier, as we have learnt very quickly in the last decade or so. Printed books carry identifying data (publisher, place, date), but what kind of unique identifier do electronic documents offer? There's no central clearing house as of yet: no equivalent of the Library of Congress, OCLC (the online cataloguing authority for books), or the ISBN.
When Michael Hart (an academic) started Project Gutenberg, he seems to have encouraged embedded page numbers in ASCII text for the reasons we've already discussed. So far, electronic documents are a sort of free-floating, indistinct mass of various kinds of information. Without some standards of granularity or resolution, research will become too unwieldy; the Internet search engine demonstrates the problem all too well.
As a librarian (rather than as a programmer, who finds useful and efficient ways of designing specific solutions), I have to worry about this sort of thing increasingly. Page numbers are, in a manner of speaking, the tip of the iceberg.
Speaking of Google Books (which BowerBird mentions above), shouldn't someone point out to them that the scanning is being rather carelessly executed? I keep running into books whose pages are so poorly positioned that part of the text is cut off, to say nothing of the page numbers.