08-25-2010, 05:27 PM   #1
jacktanner
page numbers in pdf

I wonder if it'd be possible to be clever about page number metadata for PDFs. My guess is it's very hard work and very brittle, but I figured I'd ask.

For some books, there is roman-numeral numbered front matter (table of contents, preface, etc.) and then arabic-numeral numbered content. For example,

http://www.springer.com/statistics/s...-1-4419-0741-7

shows

1st Edition., 2010, XIV, 313 p., Hardcover

That means the first 14 pages are front matter (numbered i to xiv) and the next 313 are content (numbered 1 to 313). If this metadata is embedded in a PDF, then a Kindle knows that "go to page 31" means "go to page 31 of the arabic-numeral pages". If it is not embedded, the Kindle treats page 31 as "31 pages from the beginning of the PDF", which is not as useful.
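To make the arithmetic concrete, here is a rough sketch of how the extent string from the catalogue line could be turned into an offset. The function names (`roman_to_int`, `parse_extent`, `physical_page`) are made up for illustration, and it assumes the PDF has no extra pages in front of the front matter.

```python
import re

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a roman numeral such as 'XIV' to an integer."""
    values = [ROMAN[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtractive (the I in IV).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def parse_extent(extent):
    """Pull (front_matter_pages, content_pages) out of text like 'XIV, 313 p.'."""
    match = re.search(r"\b([IVXLCDM]+)\s*,\s*(\d+)\s*p\b", extent, re.IGNORECASE)
    if match is None:
        raise ValueError("no 'ROMAN, N p.' extent found in: " + extent)
    return roman_to_int(match.group(1)), int(match.group(2))


def physical_page(printed_page, front_matter_pages):
    """Map an arabic printed page number to its 1-based position in the PDF.

    Assumption: the PDF starts directly with the front matter (no cover).
    """
    return front_matter_pages + printed_page


front, content = parse_extent("1st Edition., 2010, XIV, 313 p., Hardcover")
print(front, content)            # front matter and content page counts
print(physical_page(31, front))  # where printed page 31 sits in the file
```

With 14 pages of front matter, printed page 31 is the 45th page of the file, which is the gap a reader without page number metadata cannot know about.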

So, the question is:
- can the page number metadata be retrieved by a plugin?
- can the metadata be embedded in a PDF by a plugin?

One catch is that a PDF may not contain the same number of pages as is listed in a bibliographic database (e.g., it may also contain an embedded cover as an extra page).
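As far as I know, the PDF format itself has a "page labels" feature for this: a `/PageLabels` entry in the document catalog that maps 0-based page indices to numbering styles (`/S /r` for lowercase roman, `/S /D` for decimal, `/P` for a fixed prefix). Below is a rough sketch of how a plugin might work out the label ranges while allowing for the extra-cover problem. The function names are hypothetical, and it assumes any surplus pages sit at the very front of the file, which will not always be true. It only builds the text of the entry; actually writing it into a PDF would need a PDF library.

```python
def page_label_ranges(pdf_page_count, front_matter_pages, content_pages):
    """Return (start_index, label_dict) pairs for a /PageLabels number tree.

    Assumption: any pages beyond the catalogue counts are extras (such as
    an embedded cover) located before the front matter.
    """
    extra = pdf_page_count - (front_matter_pages + content_pages)
    if extra < 0:
        raise ValueError("PDF has fewer pages than the catalogue entry lists")

    ranges = []
    if extra > 0:
        # Extra leading pages get a fixed prefix label and no numbering.
        ranges.append((0, "<< /P (Cover) >>"))
    if front_matter_pages > 0:
        # Lowercase roman numerals, starting at i.
        ranges.append((extra, "<< /S /r >>"))
    # Decimal arabic numerals, starting at 1.
    ranges.append((extra + front_matter_pages, "<< /S /D >>"))
    return ranges


def page_labels_entry(ranges):
    """Render the ranges as the text of a /PageLabels catalog entry."""
    nums = " ".join(f"{index} {label}" for index, label in ranges)
    return f"/PageLabels << /Nums [ {nums} ] >>"


# A 328-page file for a book listed as "XIV, 313 p.": one page is unaccounted for.
ranges = page_label_ranges(328, 14, 313)
print(page_labels_entry(ranges))
```

If the page counts match exactly, the cover range is simply left out and the roman range starts at index 0. A mismatch that is not a leading cover (blank pages at the end, say) would need a different rule, or a manual override.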