Old 11-06-2007, 02:32 PM   #1
jbenny
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
Page numbers in ebooks for scholarly research?

Something that Panurge brought up in another thread needs further discussion. It was getting lost in the other thread, so I am copying the relevant parts here, in a new thread (see below).

Although the average ebook reader may not care about Panurge's question, remember that ebooks aren't just for entertainment. As more of the world's libraries become digitized, ebooks will be used by professionals, as well as casual readers. In fact, the digitized version will make it easier for everyone to access books that would otherwise be difficult or impossible to obtain.

If you have a suggestion on how to handle this issue in an ebook, without also intruding too much on the casual reader, please speak up. Also, please let's not wander off into left field about irrelevant topics (yes bowerbird, I mean you). This discussion is about using currently popular ebook formats, not your zml. I would appreciate it if you would limit your suggestions appropriately.

My responses to Panurge were focused on XHTML as it is used in epub, but if you have a suggestion on how to do things in Mobipocket, LIT, PDF, LRF, or other popular ebook formats, suggest away.

-------------------

Panurge:

> 14. don't put pagenumbers inside the text/paragraphs.

For the casual reader, this may not be an important point, but for someone who publishes scholarly texts, which require documentation, it is. The page numbers of the original text do matter, as does the exact text that lies between them.

[snip]

But for scholars who have to show in their footnotes where to locate the authority for the text they cite, a lack of representation of the pagination of the original renders the e-text useless.

[snip]

At the same time, we who are scholars have to decide whether or not the original print text-source is what we're going to refer to or the e-text facsimile. If the latter, do we regard it as a new edition or as a faithful representation of the print copy? If we don't account for these needs in our re-encoding now, we'll simply have to redo the e-texts in the future if we expect electronic texts to gain much of a foothold in the world of scholarship and education.
-------------------

jbenny:

You bring up a very valid point that most of us don't think of (me included). Can you suggest a way to handle this without having the page numbers in-line with the text? Most of us would find the visible page numbers too obnoxious.

[snip]

For XHTML markup, one thing that comes to mind (just off the top of my head) would be to enclose all the text that makes up an original page with a surrounding tag that uses the "id" attribute to hold the page number.
-------------------

jbenny:

http://www.mobileread.com/forums/att...1&d=1194151963

The content is totally bogus. I just made it up for this test. I used a <span> tag to mark the beginning few words of each page. Since a physical page is likely to fall mid-sentence, you can't use a block-level tag like <div>. Well, you could, but that would also break a sentence in the ebook, which is not what you want.
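
As a rough illustration of what such span markers make possible, here is a minimal Python sketch that reads a fragment marked up this way and builds an index of original page numbers. The sample text and the "page_N" id scheme are assumptions made for this sketch, not the contents of the attached test file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: each <span> wraps the first few words of an original page.
# The "page_N" id scheme is an assumption made for this sketch.
SAMPLE = """<div>
<p><span id="page_12">The quick brown</span> fox jumps over the lazy dog, and then
<span id="page_13">it keeps running</span> until the chapter ends.</p>
</div>"""


def page_index(xhtml):
    """Map each original page number to the words that open that page."""
    root = ET.fromstring(xhtml)
    index = {}
    for span in root.iter("span"):
        span_id = span.get("id", "")
        if span_id.startswith("page_"):
            page_number = int(span_id[len("page_"):])
            index[page_number] = (span.text or "").strip()
    return index


if __name__ == "__main__":
    for page, opening in sorted(page_index(SAMPLE).items()):
        print(f"p. {page}: {opening}...")
```

Because the spans are inline, a page boundary that falls mid-sentence does not break the paragraph; the page number lives only in the id attribute and never appears in the rendered text.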

[snip]

This is far from an ideal method, but it was the first thing that I tried. Perhaps someone has a better suggestion? How to delimit the page breaks for those who need them, while not being in-your-face for the average ebook reader? In a web browser, some javascript could make this a lot easier. However, I don't know of any ebook readers that do javascript (not counting PDAs).
-------------------

Panurge:

Some such solution might satisfy everyone. Current scholarly journal databases such as Project Muse give the page numbers in square brackets within the text, an "ugly" solution, I suppose, but a simple one. JSTOR, the dominant archive of scholarly journals, takes a different tack. It uses searchable PDF files and presents a scanned graphic representation of the original journal page, so the pagination problem is not an issue. However, the downloaded PDFs don't look all that great on the Sony Reader, though they are usable.
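
For what it's worth, the bracketed style can be converted mechanically into invisible markup. The Python sketch below assumes the visible markers look like "[123]" (the exact form used by any particular database is not specified here) and swaps each one for an empty anchor whose id carries the page number.

```python
import re

# Assumed marker style for this sketch: a bare page number in square
# brackets, e.g. "[45]", optionally followed by one whitespace character.
PAGE_MARK = re.compile(r"\[(\d+)\]\s?")


def hide_page_marks(text):
    """Replace visible [N] markers with empty anchors carrying the page number."""
    return PAGE_MARK.sub(lambda m: f'<a id="page_{m.group(1)}"></a>', text)


if __name__ == "__main__":
    print(hide_page_marks("the end of [45] the sentence"))
```

The "page_" prefix is there because an XHTML id may not begin with a digit. Note that a text which also uses "[12]"-style numbered citations would need a more specific marker pattern.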
-------------------

jbenny:

Although neither is ideal, both methods could easily be done in an epub ebook. The first would be very simple, but "ugly" as you say. Including a scanned image of each page (PDF, PNG, JPG, etc.) that is linked from the XHTML text is also possible. This would of course make the epub much larger and more work to construct.

I haven't had the time to think about other ways to do this, but there is probably a good way to do this strictly in XHTML, without having to include scans or put visible page numbers in the text. Perhaps someone else can suggest something?

Last edited by jbenny; 11-06-2007 at 02:35 PM.
Old 11-06-2007, 02:36 PM   #2
kovidgoyal
creator of calibre
Posts: 25,820
Karma: 5006091
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Just to make sure I understand the issue: you want a way of unambiguously locating an arbitrary object in an ebook file (it could be a sentence, a table, a figure, etc.).

Now we come to the question of resolution. What is the smallest object you would be satisfied with being able to reference? A paragraph? A page at some rendering resolution? Note that using page numbers from printed versions is not good enough, as in the future there may not be printed versions.

EDIT: An example from physics research articles. A resolution of sections is usually sufficient. i.e. people refer to section so-and-so of paper so and so.
I don't know if that is sufficient resolution in general though.

Last edited by kovidgoyal; 11-06-2007 at 02:40 PM.
Old 11-06-2007, 02:43 PM   #3
jbenny
Actually, Panurge was asking for something a bit simpler. He just wanted a way to correlate the text of an ebook to what page it originally came from in the scanned book. You see the problem of citing a reference in the conventional manner, which uses page numbers.

Going further and being able to reference any text or object in an ebook in a standard manner might be even better for some purposes.

The real issue is how to do any of this without having extra text or markup visible to the casual reader? As I suggested in my previous comments, using javascript would make this a lot easier, but ebook readers don't usually have that capability.
Old 11-06-2007, 02:45 PM   #4
jbenny
Your point about page numbers perhaps disappearing in the future is well taken and very likely. However, for quite a while, we would still have to deal with page numbers and references to them.
Old 11-06-2007, 02:58 PM   #5
kovidgoyal
Umm, is that really important? Assuming the scans are available online, if someone comes across a reference like "in some book on page 234" and wants to look it up, can't they just access the scanned book from, say, Google Books and find it there?

My point is that this is a rare usage scenario so perhaps the better solution is no solution. Just making scanned copies available freely online should be sufficient.

Now, addressing actual ways of doing this if you disagree with me: perhaps this should be left up to the reader apps. That is, use some semantic tagging of content to indicate which page it comes from; then, if the user selects "reference mode" in the reader app, the reader should be responsible for displaying the reference information.
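
A minimal Python sketch of that idea follows. It assumes the content is tagged with empty anchors such as `<a id="page_45"></a>` (a hypothetical tagging scheme chosen for this example, not something any current format or reader defines) and shows the marker only when a reference-mode flag is set.

```python
import re

# Assumed tagging for this sketch: empty anchors such as <a id="page_45"></a>.
ANCHOR = re.compile(r'<a id="page_(\d+)"></a>')


def render(fragment, reference_mode=False):
    """Return display text; show [p. N] markers only in reference mode."""
    if reference_mode:
        return ANCHOR.sub(lambda m: f"[p. {m.group(1)}] ", fragment)
    return ANCHOR.sub("", fragment)


if __name__ == "__main__":
    fragment = 'the end of <a id="page_45"></a>the sentence'
    print(render(fragment))
    print(render(fragment, reference_mode=True))
```

The point of the design is that the file stays the same for every reader; only the reading software decides whether the page information is displayed.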
Old 11-06-2007, 03:25 PM   #6
jbenny
If indeed scans of the original are always available, then yes, this may not be an issue. However, that may be a big if. And what if the scans were available, but later were not?

And what of new content that is purely digital? How do you make reference to a particular passage in an ebook that is unambiguous? I guess you could say "chapter xx, paragraph yy, sentence zz", but that is cumbersome.

It probably would be better to let the reader software/hardware handle this. However, I don't know of any readers that do. That's why I was wondering about ways to do it within the limits of existing ebook formats.

As ebooks become more prevalent and possibly replace p-books, some standard method of dealing with references should be available. Perhaps this is something that should be addressed in a future standard (epub or otherwise). In fact, the current focus of ebooks seems to be exclusively casual reading. If ebooks are to be used to replace textbooks and other scholarly works, then the current standards need improving. I know that currently PDF is used to solve some of these problems, but we all know the problems of using PDF on other than full-screen devices.
Old 11-06-2007, 03:36 PM   #7
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
i've responded in the original thread.

first here:
> http://www.mobileread.com/forums/sho...217#post112217

then here:
> http://www.mobileread.com/forums/sho...451#post112451

and just recently, here:
> http://www.mobileread.com/forums/sho...962#post112962

i see no reason for a new thread, and won't repeat my posts here:

the links above are one example of solving the problem
in a digital environment, of course. but they only work
because the target-file had "i.d." references coded into it.
without such referents, it's difficult to attack this task...

however, there's no reason browsers can't be improved
with a simple mechanism that let you _link_ to a page,
adding a "search phrase" which the browser acted upon.

that is, you could link to this page:
> http://z-m-l.com/go/myant/myantp111.html
but also append the search-phrase after it:
> http://z-m-l.com/go/myant/myantp111.html?sp="don't see how"
and have the browser:
(1) load the page, and
(2) execute the search,
(3) locate you right around an intended spot.
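
the three steps above can be sketched in a few lines of python. this is only an illustration: the "sp" parameter name comes from the example link, the helper names are made up for the sketch, the phrase is percent-encoded so the link stays a valid url, and the page text is passed in directly rather than fetched over the network.

```python
from urllib.parse import parse_qs, quote, urlsplit


def make_link(page_url, phrase):
    """Append the phrase as a percent-encoded "sp" query parameter."""
    return f"{page_url}?sp={quote(phrase)}"


def locate_phrase(url, page_text):
    """Return (page URL without the query, offset of the phrase or -1)."""
    parts = urlsplit(url)
    phrase = parse_qs(parts.query).get("sp", [""])[0]
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    offset = page_text.find(phrase) if phrase else -1
    return base, offset


if __name__ == "__main__":
    text = "Well, I don't see how that could work."
    link = make_link("http://z-m-l.com/go/myant/myantp111.html", "don't see how")
    print(link)
    print(locate_phrase(link, text))
```

a real browser plug-in would scroll to the returned offset; here it is simply reported.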

indeed, it might even be child's play for a web-browser
plug-in programmer to code a plug-in to do this now...

the advantage -- that you could link to any phrase
on any page, even if the author of that page hadn't
coded any i.d. references -- is huge, seems to me...

otherwise, your system depends on other people
having done the work you wish them to have done,
and that's never gonna prove to be a tenable solution.

-bowerbird
Old 11-06-2007, 03:50 PM   #8
NatCh
Gizmologist
Posts: 11,605
Karma: 926222
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Nook STGR
I thought about this issue some years ago, because my wife is a literary scholar, so I tend to consider that side of things, even though it doesn't impact me directly.

Going forward, if this e-book thing really does take off, some way of absolute reckoning within a text that isn't dependent on pages is going to have to emerge. For books where original or scanned files exist, page references will continue to work indefinitely, but for things which are never published physically they may or may not work, depending on the method.

My thought is probably a paragraph or line numbering approach, either from the beginning or from chapters or whatever other type of sectioning makes sense, would work well enough, but it would need to be present and the same in all versions of the e-publication. Preferably, it could be toggled on and off in the reading software.

I think, however, that this is yet another example of something that's dependent on a "standard" e-book format, and would need to be built into both files and viewing software in order to be at all viable.

Just my thoughts, salt to taste.
Old 11-06-2007, 03:59 PM   #9
jbenny
NatCh, very good thoughts on this.

You are correct that this needs to be standardized in both the ebook format itself and the reading device or software. Yes, some way to toggle this information off and on would be very much desired, so as not to interfere with normal reading. I just don't see this happening without some additional functionality built into the reader, however. This needs to be addressed by standards as well.

Edit: I hope that some of the folks at IDPF are listening in and taking notes.
Old 11-06-2007, 04:04 PM   #10
jbenny
Just to follow up the comments on standards - although things like this would be best addressed by some future standard, I am still curious as to ways that this might be dealt with today, within existing standards and using popular ebook formats.
Old 11-06-2007, 04:05 PM   #11
Patricia
Reader
Posts: 11,520
Karma: 2199070
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
My experience is that while scientists often write short but pithy papers where section references are sufficient, things are different in the Humanities. Literature specialists have to refer to author, title and page and often want to use a particular scholarly edition. Philosophers do the same when referring to the work of contemporaries.

I teach a fair amount of Plato and Aristotle and find online texts are a problem. When referring to Plato, it is essential to use Stephanus numbers, which will identify any sentence in his entire oeuvre. These appear as marginal numbers and letters in most print versions in both English and Greek. But the numbers simply don't appear in the online versions of Plato (except for the Perseus Project version). So I can't recommend them to students and don't use online versions myself.

(This is why I've never uploaded a Plato dialogue: without the Stephanus numbers it is useless to me. But with them it is irritating to general readers.)
Old 11-06-2007, 04:09 PM   #12
kovidgoyal
As a practical matter I often find that unambiguous identifiers in the source *text* itself work "well enough". For instance, "the paragraph after figure 3" or the "introductory text of section X".

The problem with line numbering is that the number of lines depends on screen size/font size with a reflowable format. Paragraph numbers might work though assuming the referenced document semantically identifies paragraphs.
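
A quick Python sketch makes the difference concrete. It wraps the same made-up two-paragraph text at two widths (standing in for two screen sizes) and reports where a cited word lands; the text, the widths and the function name are all assumptions for illustration.

```python
import textwrap

# Made-up sample text; paragraphs are separated by a blank line.
TEXT = (
    "First paragraph, long enough to wrap onto several lines on a narrow screen.\n\n"
    "Second paragraph, which holds the passage we want to cite."
)


def cite_position(text, word, width):
    """Return (paragraph number, line number) of word when wrapped to width."""
    line_no = 0
    for para_no, para in enumerate(text.split("\n\n"), start=1):
        for line in textwrap.wrap(para, width=width):
            line_no += 1
            if word in line:
                return para_no, line_no
    return None


if __name__ == "__main__":
    for width in (30, 60):
        print(width, cite_position(TEXT, "cite", width))
```

The paragraph number is the same at both widths, while the line number changes with the wrap width, which is why only the former is usable as a reference in a reflowable format.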
Old 11-06-2007, 04:20 PM   #13
jbenny
Quote:
Originally Posted by Patricia
My experience is that while scientists often write short but pithy papers where section references are sufficient, things are different in the Humanities. Literature specialists have to refer to author, title and page and often want to use a particular scholarly edition. Philosophers do the same when referring to the work of contemporaries.

I teach a fair amount of Plato and Aristotle and find online texts are a problem. When referring to Plato, it is essential to use Stephanus numbers, which will identify any sentence in his entire oeuvre. These appear as marginal numbers and letters in most print versions in both English and Greek. But the numbers simply don't appear in the online versions of Plato (except for the Perseus Project version). So I can't recommend them to students and don't use online versions myself.

(This is why I've never uploaded a Plato dialogue: without the Stephanus numbers it is useless to me. But with them it is irritating to general readers.)
In works like this, where the numbers are expected to be present, there is no reason that the online versions couldn't include them. An example epub that I posted in another thread showed one way to do paragraph numbers in a separate column. Something similar could probably be done with the Stephanus numbers (I'm sure there are other ways to do this as well).

As for the numbers being irritating to the general reader, they could easily be toggled with some javascript and an additional stylesheet in a web browser. You just couldn't toggle them in an ebook version, as far as I can see.
Old 11-06-2007, 04:22 PM   #14
DaleDe
Grand Sorcerer
Posts: 9,583
Karma: 5071748
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
Quote:
Originally Posted by jbenny
Your point about page numbers perhaps disappearing in the future is well taken and very likely. However, for quite a while, we would still have to deal with page numbers and references to them.
Page numbers are not enough. You also need more data to use a book in a bibliography or other scholarly reference. Different editions have different numbering, and even a hardback and a paperback can have different numbering. Often the scanned book data is not detailed enough to define these differences. This is why, in business, where references are needed, the sections and sometimes even the paragraphs are numbered. Page numbers are not really enough.

Dale
Old 11-06-2007, 04:30 PM   #15
jbenny
Quote:
Originally Posted by kovidgoyal
The problem with line numbering is that the number of lines depends on screen size/font size with a reflowable format. Paragraph numbers might work though assuming the referenced document semantically identifies paragraphs.
Yes, line numbering isn't going to work for ebooks, which are reflowable. Maybe paragraph numbers would be fine-grained enough?
All times are GMT -4.