View Poll Results: How important are page numbers in Kindle Books?
Very important - I tend to avoid those books and forget the author 16 8.56%
Nice to have - I use them if they are there 57 30.48%
Not important at all - get over yourself. 114 60.96%
Voters: 187.
Old 03-30-2016, 09:50 PM   #121
AnotherCat
....
Posts: 1,547
Karma: 18068960
Join Date: May 2012
Device: ....
Quote:
Originally Posted by Cinisajoy View Post
You misunderstood.
He literally put page number 145 on page 1. There was only the title and copyright pages before that. The book was less than 200 actual pages.
No, I did understand. That is why I said and it took you 10 seconds to read those 145 pages with nothing on them, or to work out that they do not exist at all. I covered both cases ;-).
Old 03-30-2016, 09:55 PM   #122
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Quote:
Originally Posted by AnotherCat View Post
No, I did understand. That is why I said and it took you 10 seconds to read those 145 pages with nothing on them, or to work out that they do not exist at all. I covered both cases ;-).
I misread your post.
Old 03-30-2016, 10:04 PM   #123
AnotherCat
....
Posts: 1,547
Karma: 18068960
Join Date: May 2012
Device: ....
Quote:
Originally Posted by Cinisajoy View Post
I misread your post.
No problem Cinisajoy, I enjoy your posts whether I misread them or not so you're allowed to misread mine.

I have wondered if it would be a good idea for some books to have no page numbers at all. For example, for me Bleak House comes to mind: many pages (however one wants to count them, except if on a scroll), a storyline that moves at less than a snail's pace, prose such that what appears on succeeding pages is not closely correlated, etc. Then I could rip out every second page and get through it much faster without knowing that half of it wasn't even there.
Old 03-30-2016, 11:32 PM   #124
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Quote:
Originally Posted by AnotherCat View Post
No problem Cinisajoy, I enjoy your posts whether I misread them or not so you're allowed to misread mine.

I have wondered if it would be a good idea for some books to have no page numbers at all. For example, for me Bleak House comes to mind: many pages (however one wants to count them, except if on a scroll), a storyline that moves at less than a snail's pace, prose such that what appears on succeeding pages is not closely correlated, etc. Then I could rip out every second page and get through it much faster without knowing that half of it wasn't even there.
Atlas Shrugged is another one that comes to mind on page numbers. I am managing one chapter every few months.
I don't think you would miss much losing 3/4 of the pages.
Old 03-31-2016, 04:47 AM   #125
MikeB1972
Gnu
Posts: 1,222
Karma: 15625359
Join Date: Jul 2009
Location: UK
Device: BeBook,JetBook Lite,PRS-300-350-505-650,+ran out of space to type
I'd have thought the biggest problem with referencing specific locations in ebooks would be updates. Revision numbers appear to be an afterthought so it is nigh on impossible to tell what, if any, changes have been made and you can't get an older version anyway, so if you reference an exact point in an ebook and changes are made then the reference is useless.
Old 03-31-2016, 09:36 AM   #126
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Quote:
Originally Posted by MikeB1972 View Post
I'd have thought the biggest problem with referencing specific locations in ebooks would be updates. Revision numbers appear to be an afterthought so it is nigh on impossible to tell what, if any, changes have been made and you can't get an older version anyway, so if you reference an exact point in an ebook and changes are made then the reference is useless.
Isn't it the same for paper editions?
If there is more than 1 edition.
The DSM comes to mind. There are at least 4 editions and all of them have significant changes. So referencing any specific behavior in one may be something totally different or absent in another. Which is why one needs to reference the specific book.
Any classic book is probably the same as there are many versions of them.
*Diagnostic and Statistical Manual of Mental Disorders.
Old 03-31-2016, 11:21 AM   #127
MikeB1972
Gnu
Posts: 1,222
Karma: 15625359
Join Date: Jul 2009
Location: UK
Device: BeBook,JetBook Lite,PRS-300-350-505-650,+ran out of space to type
Quote:
Originally Posted by Cinisajoy View Post
Isn't it the same for paper editions?
If there is more than 1 edition.
The DSM comes to mind. There are at least 4 editions and all of them have significant changes. So referencing any specific behavior in one may be something totally different or absent in another. Which is why one needs to reference the specific book.
Any classic book is probably the same as there are many versions of them.
*Diagnostic and Statistical Manual of Mental Disorders.
Except with a paper edition you can track down an old version, that isn't possible with ebooks.
Old 03-31-2016, 11:27 AM   #128
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Quote:
Originally Posted by MikeB1972 View Post
Except with a paper edition you can track down an old version, that isn't possible with ebooks.
Not always lol.
Old 03-31-2016, 08:48 PM   #129
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Hitch View Post
Now, of course--the sentence or item on the old Page 12 that may contain information that's of use to the new reader may now be 1, 2, 3 or more "page flips" away from the location of that ID number, because those "page numbers" are based on page "print" size--say, 5.5x8, or one of those.
This is a key point. Physical Page Numbers are fully dependent on the book being in that exact format (that exact page size + page margins + fonts + font size + [...]). If you change one of those variables, all of the page numbers get thrown off.
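
A rough sketch of why that happens, in Python. The words-per-page figures and the layout names are made-up assumptions for illustration (real pagination depends on margins, fonts, hyphenation, and so on), but the arithmetic shows how the same sentence drifts to a different "page" as soon as the layout changes:

```python
def page_of(word_offset: int, words_per_page: int) -> int:
    """Return the 1-based page on which a 0-based word offset falls."""
    return word_offset // words_per_page + 1

# The same sentence, 18,000 words into the text, under three
# hypothetical layouts (words per page are assumed, not measured).
offset = 18_000
layouts = {"hardcover": 500, "paperback": 400, "large print": 250}

pages = {name: page_of(offset, wpp) for name, wpp in layouts.items()}
print(pages)  # {'hardcover': 37, 'paperback': 46, 'large print': 73}
```

One sentence, three different "page numbers", and nothing about the text itself changed.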

This also brings up the problem of actual text cross-references used in physical books. Some might be in this form:
  • "See Footnote 1 on page 252"
  • An index might say "312n" (a footnote on page 312) or "312n5" (Footnote 5 on page 312)

This sort of text makes absolutely ZERO sense in an ebook.

More neutral text that makes sense in physical + ebook would be something like:
  • "See Chapter 2, Footnote 20"
  • "See Section 1.1"
  • "See Footnote 2 in Section 1.1"

The above text works in Print, Large Print, EPUB, PDF, HTML, whatever.

Or let us rant about one of my favorites... footnotes. In a physical book, footnotes may be numbered per page (so restarting from #1 each page). In a digital/other version, this numbering system becomes impossibly unwieldy. You might have links to 10 different "Footnote #1" in Chapter 2!

This requires the people typesetting/creating the physical book to be mindful of future/alternate formats. (Currently many publishers still just stick with the physical page number of hardcover or the highway!)

But hopefully more awareness of this problem at least shifts that mindset to make the texts themselves more neutral/ebook friendly (such as numbering footnotes sequentially per chapter/book).
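
To make the footnote point concrete, here is a minimal Python sketch of renumbering per-page footnotes sequentially within each chapter. The input layout (a list of chapter, page, per-page number tuples in reading order) and the function name are my own assumptions, not any particular tool's format:

```python
from typing import Dict, List, Tuple

# (chapter, print page, footnote number as printed on that page)
PrintNote = Tuple[int, int, int]

def renumber_per_chapter(notes: List[PrintNote]) -> List[Tuple[int, int, int, int]]:
    """Append a chapter-sequential number to each footnote.

    Notes must already be in reading order.
    """
    counters: Dict[int, int] = {}
    result = []
    for chapter, page, printed in notes:
        counters[chapter] = counters.get(chapter, 0) + 1
        result.append((chapter, page, printed, counters[chapter]))
    return result

# Chapter 2 has three different "Footnote 1"s in print.
notes = [(2, 40, 1), (2, 40, 2), (2, 41, 1), (2, 43, 1), (3, 60, 1)]
for chapter, page, printed, sequential in renumber_per_chapter(notes):
    print(f"Ch. {chapter}, p. {page}, printed #{printed} -> Footnote {sequential}")
```

After renumbering, "See Chapter 2, Footnote 3" is unambiguous in any format, while "Footnote 1 on page 41" only means something in that one print layout.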

Quote:
Originally Posted by Hitch View Post
It's not an hour's work. It's more. (Plus checking, double-checking, etc.)

[...]

[...] or don't indulge in logic puzzles, or create indices, the "proper way to create indices in eBooks" sounds simple. It's easy to do poorly. It's not remotely simple to do correctly.
My gods... creating a proper index is EXPONENTIALLY more work than creating a simple dumb index (which already takes forever).

The "dumb index" (each link points right before the first word of that page) might land you a few ebook "page flips" away from the content. Depending on the density of the original physical pages, it could be ~400-800 words away.

As shalym mentioned, a more useful/thoroughly done "proper index" would point to the exact paragraph/sentence/word-level in which this reference occurred... but most people don't understand how... fracking... long... this... takes.

Creating an Index is so hard, and A HELL OF A LOT harder than it seems on the surface.

As an example: I am currently working on a "proper index" of a large non-fiction treatise (950 pages, ~400k words, Index: ~2.3k terms + ~5.1k links to page numbers). I already have the Index from the physical book (so "half the hard work is already done").

My current pace of converting this to a "proper index" is ~100 LINKS PER DAY. That means around 51 man-days of work (probably more).

Each and every link to a page number causes a cascade of extra work that you don't expect:

Easy Ones: These are easy: "Apologists, 48", ok, great, I reach page 48, and there is only 1 "apologists" on the entire page. Link the paragraph, problem solved!

These might take a few seconds to a minute.

Hard Ones: Hard ones are fracking HARD: "Ancestors, 3, 36, 145".

Great, I found the word "ancestors" on page 3, EASY. But wtf is this? I just read the entire page 36, and I don't see "ancestors" anywhere on it.

You (as the converter) must now read/skim the ~400-800 words that constitute "page 36" to find what the Indexer ACTUALLY meant.

You have to look for all the related words: "ancestry" + "ancestor" + "ancestral". Maybe it just has an important sentence/paragraph that talks about ancestors indirectly (maybe talking about older relatives, or ancient civilizations).
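
The mechanical half of that hunt can be scripted. A small Python sketch, where the page text and the stem are made-up examples: it finds every word on a page that starts with a given stem. It does nothing for the indirect mentions (older relatives, ancient civilizations), which is exactly the part that still needs a human:

```python
import re

def find_variants(page_text: str, stem: str) -> list[str]:
    """Return every word on the page that starts with the given stem."""
    pattern = re.compile(rf"\b{re.escape(stem)}\w*", re.IGNORECASE)
    return pattern.findall(page_text)

# Made-up stand-in for the text of "page 36".
page_36 = "Their ancestral lands were lost. Ancestry mattered less than trade."
print(find_variants(page_36, "ancest"))  # ['ancestral', 'Ancestry']
```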

Hard #2: "Keynes, John Maynard, 429, 464, 467, 468n, 546n, 737, 771, 785, 787, 846".

Keynes might be mentioned multiple times on a page. Purely because of the way the physical book was laid out (page margins, font, [...]), it might so happen that Keynes is mentioned in the first + last paragraphs on page 429, BUT the middle paragraphs don't talk about him at all.

Where do I link? Do I link to that first paragraph? Do I link to the last paragraph too?

Keynes may also be mentioned quite a few times on other pages throughout the book, but only as an unimportant/passing remark. Those mentions don't belong in the Index. In my searching/jumping around page numbers, though, I STILL come across "Keynes" a hundred times, and this takes time to sift through. (This is the problem with the Search/Concordance method + any sort of automated/semi-automated Indexing tools.)

Hard #3: As Hitch mentioned, the same topic might be under multiple Index entries. This requires you to look through the Index and make sure all of THOSE links are the same as well. You don't want "Irish Setters" + "Setters, Irish" + "Sporting Dogs -> Setters -> Irish Setters" to point to different locations. This means you have to thoroughly (and I mean FRACKING THOROUGHLY) look through the Index when you are trying to create these things.
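
The cross-checking itself can at least be flagged automatically, even if fixing it cannot. A minimal Python sketch, assuming the index is held as a dict of heading -> link targets and that someone has already grouped the equivalent headings by hand (the anchors like "ch4.html#p112" are invented for illustration):

```python
from typing import Dict, List

def inconsistent_groups(
    index: Dict[str, List[str]],
    synonym_groups: List[List[str]],
) -> List[Dict[str, frozenset]]:
    """Return the groups whose equivalent headings link to different targets."""
    problems = []
    for group in synonym_groups:
        targets = {heading: frozenset(index.get(heading, ())) for heading in group}
        if len(set(targets.values())) > 1:
            problems.append(targets)
    return problems

# Hypothetical index data; the third entry points somewhere else.
index = {
    "Irish Setters": ["ch4.html#p112"],
    "Setters, Irish": ["ch4.html#p112"],
    "Sporting Dogs -> Setters -> Irish Setters": ["ch4.html#p118"],
}
groups = [["Irish Setters", "Setters, Irish",
           "Sporting Dogs -> Setters -> Irish Setters"]]

problems = inconsistent_groups(index, groups)
for group in problems:
    for heading, targets in group.items():
        print(heading, "->", sorted(targets))
```

Deciding which of the two targets is the right one still means going back and reading the text.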

These hard ones might take 10+ minutes.

This book I am working on takes ~5 minutes on average per link (this takes into account double/triple-checking that the links are correct and a mistake was not made).
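
Putting the numbers from this post together (treating ~5.1k links, ~100 links per day, and ~5 minutes per link as exact, which they are not):

```python
links = 5_100            # ~5.1k links to page numbers
links_per_day = 100      # current pace
minutes_per_link = 5     # average, including double/triple-checking

days = links / links_per_day                            # working days
total_hours = links * minutes_per_link / 60             # hours of linking
hours_per_day = links_per_day * minutes_per_link / 60   # hours in a 100-link day

print(f"{days:.0f} days, {total_hours:.0f} hours total, "
      f"{hours_per_day:.1f} hours per day")
# 51 days, 425 hours total, 8.3 hours per day
```

So 100 links at ~5 minutes each is already a full working day, and the whole Index comes to roughly 425 hours.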

And this "proper index" I am working on is already "simpler". I already HAVE an index with page numbers on it, and I know the subject matter deeply (economics). Doing this as a business (at an ebook conversion house) would be IMPOSSIBLY expensive.

This rant didn't even tackle the subject of digitizing page RANGES in ebooks such as: "bilateral exchange, 794–796." Which paragraph should this entry end on? Well, I have to read pages 794–796 to find out! So you just think "Hey, it is just two lousy links/pages, how long could that take?" MINUTES TIMES HUNDREDS/THOUSANDS!!!

Long story short, Indexing is an art, and it is fracking HARD (and very specialized).

Quote:
Originally Posted by Hitch View Post
I confess I am replying to this prior to scoping out the Wikipedia article, but I've always thought that IF (and, hooooooooooo boy, is this a pipe dream in this regard) we could get Amazon, iBooks, B&N, etc., to all agree (or...maybe the IDPF???) that a PAGE = X characters, we could cut through a boatload of this s**t.
Don't get ahead of yourself! X characters of WHAT?
  • Of HTML?
    • What about whitespace? Should it change if I "prettify" the HTML file?
  • Of displayed characters?
    • What about hidden code/text that only shows in one edition but not the other?
      • Alt Text, Fleurons, stuff that shows in MOBI but not in KF8.
      • Math (SVG or images or MathML).
  • How would you treat things like the NCX? (In EPUB, you don't NEED an HTML TOC)
    • In future formats, there may be other different/easier files that make it easy to remove certain material (copyright page, title pages, Indexes, etc. etc.)
  • What if a future format has text generated on the fly?
    • For example, maybe in the future you just feed it an ISBN/DOI and it will generate a citation in a given format for you.
  • What about poems? (You know, a whole book of those twenty word poems.)
    • All of a sudden people will complain about the "10 page book" (10 pages = ~5000 words = 250 poems * 20 words)!
  • What about Front/Back Matter? (Typically Front Matter is in Roman Numeral numbering).
    • What happens when you move the "Front Matter" to the back of the ebook (such as the TOC)?
    • Should ebooks have similar alternate numbering?
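
The whitespace question is easy to demonstrate. A small Python sketch using only the standard library's html.parser: the two HTML strings are made-up examples of the same sentence, one compact and one "prettified". The raw character counts differ, while the whitespace-normalized displayed text is identical:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)

def visible_text(html: str) -> str:
    """Return the displayed text with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())

compact = "<p>Call me <i>Ishmael</i>.</p>"
pretty = "<p>\n    Call me\n    <i>Ishmael</i>.\n</p>\n"

print(len(compact), len(pretty))                     # raw HTML lengths differ
print(len(visible_text(compact)), len(visible_text(pretty)))  # displayed lengths match
```

Even this only settles the whitespace bullet. Alt text, MathML, format-specific content, and generated text would each need their own rule before "X characters" means the same thing everywhere.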

Quote:
Originally Posted by HarryT View Post
The standard used for academic referencing by pretty much everyone is called the "Harvard Referencing System", and that does require page numbers.
There are QUITE a few other citation styles:

https://en.wikipedia.org/wiki/Citation#Styles

While Harvard is one of the more popular ones, it depends mostly on which fields you are in. There is also:
  • ACS (typically used in Chemistry)
  • AMS (Math)
  • APA (Psychology)
  • ASA (Sociology)
  • Bluebook (Law)
  • Chicago
  • IEEE (Engineering, Programming, Physics)
  • MLA
  • Oxford
  • Turabian
  • Vancouver
  • [...]

Quote:
Originally Posted by HarryT View Post
If you look up the appropriate referencing guide [...] For books, though, it always involves a page number (or numbers), to return to the original topic.
Not necessarily... there IS a reason why they introduced rules on handling "websites" + "ebooks" + all the other digital resources besides "books". :P

If you point to a website, there is no fracking page number. I would say an ebook is much closer to all the other digital formats (website) than physical book.

Side Note: Also, a much more intelligent solution for generating bibliographies is with a database of information which gets fed into a template (which outputs the specific Citation Style you are using). You feed the tool information such as (Author, Title, Year, Publisher, ISBN, [...]), you tell it what type it is (Book, Journal, Website, [...]), and the tool generates the proper format for you.

This is the purpose of things like BibLaTeX or using things such as Wikipedia Citation Templates:

https://en.wikipedia.org/wiki/Wikipe...tion_templates
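
A toy Python version of the idea, to show the shape of it. The two templates are deliberately simplified stand-ins and NOT the real rules of any named citation style, and the record is a made-up placeholder, not a real book:

```python
# One record in the "database"; all values are placeholders.
record = {
    "author": "Doe, Jane",
    "title": "An Example Treatise",
    "year": 2016,
    "publisher": "Example Press",
}

# Simplified, hypothetical templates; real styles have many more rules.
TEMPLATES = {
    "author-date": "{author} ({year}). {title}. {publisher}.",
    "numeric": "[{n}] {author}, {title}. {publisher}, {year}.",
}

def cite(record: dict, style: str, n: int = 1) -> str:
    """Render one record in the chosen style; n is the list position."""
    return TEMPLATES[style].format(n=n, **record)

print(cite(record, "author-date"))
# Doe, Jane (2016). An Example Treatise. Example Press.
print(cite(record, "numeric", n=3))
# [3] Doe, Jane, An Example Treatise. Example Press, 2016.
```

The data is entered once, and switching citation styles is just a matter of switching templates.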

Quote:
Originally Posted by HarryT View Post
This is of course not an issue for the overwhelming majority of referencing, given that references are generally to non-fiction sources which are normally only published by a single publisher, so the issue of multiple alternative editions rarely arises.
Even this I would say is changing.

Even in the academic world, more books are coming out in multiple forms:
  • Print versions
    • Hardcover + Paperback + Large Print
  • Digital versions
    • EPUB/MOBI
    • HTML versions
      • These potentially might have a lot more material than the print versions (video, audio, computer generated examples (think randomly generated math problems or graphs)).
    • PDFs
      • Most likely matches one of the Print Editions, but not necessarily (see HTML version above). Perhaps there might be more interaction in the PDF, or annotations, etc. etc.

Sure, you can have many who agree that "the hardcover print edition is where the page numbers come from". And sure, back in the stone ages, when you only had Print/Large Print or Hardcover + Softcover, you could insist that the pages = the hardcover. But sticking to those physical page numbers makes absolutely NO SENSE when you have multiple vastly differing digital formats.

And that is just talking about books. You may have something like an article that is standalone (PDF), reprinted on a site (HTML), plus the same article reproduced in a journal (different page sizes, margins, fonts, double-column, etc. etc.). Which page numbers do you INSIST on shoving onto the HTML version, the standalone's page numbers? The journal's? Which journal (the most prestigious?)?

And then this doesn't touch the purely digital texts (never physically printed, such as many self-published books).

Last edited by Tex2002ans; 03-31-2016 at 09:36 PM.
Old 03-31-2016, 08:53 PM   #130
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Quote:
Originally Posted by Tex2002ans View Post
This is a key point. Physical Page Numbers are fully dependant on the book being in that exact format (that exact page size + page margins + fonts + font size + [...]). If you change one of those variables, all of the page numbers get thrown off.

This also brings up the problem of actual text cross-references used in physical books. Some might be in this form:
  • "See Footnote 1 on page 252"
  • An index might say "312n" (a footnote on page 312) or "312n5" (Footnote 5 on page 312)

This sort of text makes absolutely ZERO sense in an ebook.

More neutral text that makes sense in physical + ebook would be something like:
  • "See Chapter 2, Footnote 20"
  • "See Section 1.1"
  • "See Footnote 2 in Section 1.1"

Or let us rant about one of my favorites... footnotes. In a physical book, footnotes may be numbered per page (so restarting from #1 each page). In a digital/other version, this numbering system becomes impossibly unwieldy. You might have links to 10 different "Footnote #1" in Chapter 2!

This requires the people typesetting/creating the physical book to be mindful of future/alternate formats. (Currently many publishers still just stick with the physical page number of hardcover or the highway!) But hopefully more awareness of this problem at least shifts that mindset to make the texts themselves more neutral/ebook friendly (such as numbering footnotes sequentially per chapter/book).



My gods... creating a proper index is EXPONENTIALLY more work than creating a simple dumb index (which already takes forever).

The "dumb index" (points right before the first word of that page), might get you a few ebook "page flips" away from the content. Depending on the density of the original physical pages, it could be ~400-800 words away.

As shalym mentioned, a more useful/thoroughly done "proper index" would point to the exact paragraph/sentence/word-level in which this reference occurred... but most people don't understand how... fracking... long... this... takes.

Creating an Index is so hard, and A HELL OF A LOT harder than it seems on the surface.

As an example: I am currently working on a "proper index" of a large non-fiction treatise (950 pages, ~400k words, Index: ~2.3k terms + ~5.1k links to page numbers). I already have the Index from the physical book (so "half the hard work is already done"). My current pace of converting this to a "proper index" is ~100 LINKS PER DAY. That means around 51 man-days of work (probably more).

Each and every link to a page number causes a cascade of extra work that you don't expect:

Easy Ones: These are easy: "Apologists, 48", ok, great, I reach page 48, and there is only 1 "apologists" on the entire page. Link the paragraph, problem solved!

These might take a few seconds to a minute.

Hard Ones: Hard ones are fracking HARD: "Ancestors, 3, 36, 145".

Great, I found the word "ancestors" in page 3, EASY. But wtf is this, I just read the entire page 36, and I don't see "ancestors" on the page.

You (as the converter) must now read/skim the ~400-800 words that constitute "page 36" to find what the Indexer ACTUALLY meant.

You have to look for all the related words: "ancestry" + "ancestor" + "ancestral". Maybe it just has an important sentence/paragraph that talks about ancestors indirectly (maybe talking about older relatives, or ancient civilizations).

Hard #2: "Keynes, John Maynard, 429, 464, 467, 468n, 546n, 737, 771, 785, 787, 846".

Keynes might be mentioned multiple times on a page. It just so happened to be because of the way the physical book was laid out (page margins, font, [...]), that Keynes was mentioned in the first + last paragraph on page 429, BUT, the middle paragraphs don't talk about him at all.

Where do I link? Do I link to that first paragraph? Do I link to the last paragraph too?

Keynes may also be mentioned quite a few times throughout the book on other pages, but it is just an unimportant/passing remark. This doesn't belong in the Index. In my searching/jumping around page numbers though, I STILL come across "Keynes" a hundred times, this takes time to sift through. (This is the problem of the Search/Concordance method + any sort of automated/semi-automated Indexing tools).

Hard #3: As Hitch mentioned, the same topic might be under multiple Index entries. This requires you to look through the Index and make sure all of THOSE links are the same as well. You don't want "Irish Setters" + "Setters, Irish" + "Sporting Dogs -> Setters -> Irish Setters" to point to different locations. This means you have to thoroughly (and I mean FRACKING THOROUGHLY) look through the Index when you are trying to create these things.

These hard ones take a minute+.

This book I am working on takes ~5 minutes on average per link (this takes into account double/triple-checking that the links are correct and a mistake was not made).

And this "proper index" I am working on is already "simpler". I already HAVE an index with page numbers on it, and I know the subject matter deeply (economics). Doing this as a business (at an ebook conversion house) would be IMPOSSIBLY expensive.

This rant didn't even tackle the subject of digitizing page RANGES in ebooks such as: "bilateral exchange, 794–796." Which paragraph should this entry end on? Well, I have to read those page 794-796 to find out! So you just think "Hey, it is just two lousy links/pages, how long could that take?" MINUTES!!!

Long story short, Indexing is an art, and it is fracking HARD (and very specialized).



Don't get ahead of yourself! X characters of WHAT?
  • Of HTML?
    • What about whitespace? Should it change if I "prettify" the HTML file?
  • Of displayed characters?
    • What about hidden code/text that only shows in one edition but not the other?
      • Alt Text, Fleurons, stuff that shows in MOBI but not in KF8.
      • Math (SVG or images or MathML).
  • How would you treat things like the NCX? (In EPUB, you don't NEED an HTML TOC)?
    • In future formats, there may be other different/easier files that make it easy to remove certain material (copyright page, title pages, Indexes, etc. etc.)
  • What if a future format has text generated on the fly?
    • For example, maybe in the future you just feed it an ISBN/DOI and it will generate a citation in a given format for you.
  • What about poems? (You know, a whole book of those twenty word poems.)
    • All of a sudden people will complain about the "10 page book" (10 pages = ~5000 words = 250 poems * 20 words)!
  • What about Front/Back Matter? (Typically Front Matter is in Roman Numeral numbering).
    • What happens when you move the "Front Matter" to the back of the ebook (such as the TOC?).
    • Should ebooks have similar alternate numbering?



There are QUITE a few other citation styles:

https://en.wikipedia.org/wiki/Citation#Styles

While Harvard is one of the more popular ones, it depends mostly on which fields you are in. There is also ACS (typically used in Chemistry), AMS (Math), APA (Psychology), ASA (Sociology), Bluebook (Law), Chicago, IEEE (Engineering, Programming, Physics), MLA, Oxford, Turabian, Vancouver, [...].



Not necessarily... there IS a reason why they introduced rules on handling "websites" + "ebooks" + all the other digital resources besides "books". :P

If you point to a website, there is no fracking page number. I would say an ebook is much closer to all the other digital formats (website) than physical book.

Side Note: Also, a much more intelligent solution for generating bibliographies is with a database of information which gets fed into a template (which outputs the specific Citation Style you are using). You feed the tool information such as (Author, Title, Year, Publisher, ISBN, [...]), you tell it what type it is (Book, Journal, Website, [...]), and the tool generates the proper format for you.

This is the purpose of things like BibLaTeX or using things such as Wikipedia Citation Templates:

https://en.wikipedia.org/wiki/Wikipe...tion_templates
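A rough sketch of the "database fed into a template" idea, in Python. Everything here is an assumption for illustration: the record is invented, the `cite` helper is hypothetical, and the two layouts are deliberately simplified stand-ins rather than accurate renderings of any real citation style (and nothing like what BibLaTeX actually does internally).

```python
from string import Template

# Invented record: one entry in the bibliography "database".
record = {
    "kind": "book",
    "author": "Doe, J.",
    "title": "An Example Book",
    "year": "2016",
    "publisher": "Example Press",
}

# Simplified, made-up layouts keyed by (entry type, style name).
TEMPLATES = {
    ("book", "author-date"): Template("$author ($year). $title. $publisher."),
    ("book", "numeric"): Template("[$n] $author, $title. $publisher, $year."),
}


def cite(entry, style, n=1):
    """Render one entry in the requested style; `n` is its number in the list."""
    template = TEMPLATES[(entry["kind"], style)]
    return template.substitute(entry, n=n)


print(cite(record, "author-date"))
print(cite(record, "numeric", n=3))
```

The data is entered once, and switching styles means picking a different template instead of retyping every reference. Notice that neither layout needs a page number to work.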



Even this I would say is changing.

Even in the academic world, more books are coming out in multiple forms:
  • Print versions
    • Hardcover + Paperback + Large Print
  • Digital versions
    • EPUB/MOBI
    • HTML versions
      • These potentially might have a lot more material than the print versions (video, audio, computer generated examples (think randomly generated math problems or graphs)).
    • PDFs
      • Most likely matches one of the Print Editions, but not necessarily (see HTML version above). Perhaps there might be more interaction in the PDF, or annotations, etc. etc.

Sure, you can have many who agree that "the hardcover print edition is where the page numbers come from". Sure, back in the stone ages, when you only had Print/Large Print or a Hardcover + Softcover, you could insist that the pages = the hardcover. But sticking to those physical page numbers makes absolutely NO SENSE when you have multiple vastly differing digital formats.

And that is just talking about books. You may have something like an article that is standalone (PDF), reprinted on a site (HTML), plus the same article reproduced in a journal (different page sizes, margins, fonts, double-column, etc. etc.). Which page numbers do you INSIST on shoving onto the HTML version, the standalone's page numbers? The journal's? Which journal (the most prestigious?)?

And this doesn't even touch the purely digital texts (never physically printed, such as many self-published books).
Can I just say I luv you for your entire post?
Cinisajoy is offline   Reply With Quote
Old 03-31-2016, 09:36 PM   #131
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Cinisajoy View Post
Can I just say I luv you for your entire post?
Old 04-01-2016, 01:21 AM   #132
eschwartz
Ex-Helpdesk Junkie
eschwartz ought to be getting tired of karma fortunes by now.

Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
@Tex2002ans,

As always, you say it much better than I ever could (well, that is why you do this professionally...)


I just stick with "meh, relics of a legacy format" and refuse to even think about indexes.
Old 04-01-2016, 02:34 AM   #133
Hitch
Bookmaker & Cat Slave
Hitch ought to be getting tired of karma fortunes by now.

Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by eschwartz View Post
@Tex2002ans,

As always, you say it much better than I ever could (well, that is why you do this professionally...)


I just stick with "meh, relics of a legacy format" and refuse to even think about indexes.
I'm with Cins and eschwartz on this. You said and explained it better than I. Of course, I seem to recall that you're neck-deep in indexing woes, as we speak, are you not?

You have my sympathies. I hope you get paid for those damn hours! (Trust me: I think I've had a handful of clients over the years that were willing to pay for the changeover from the static index to the HTML version.)

Hitch
Old 04-01-2016, 02:58 AM   #134
Gail Ceylon
Junior Member
Gail Ceylon began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Apr 2016
Device: none

As an ebook author [promotional text deleted - MODERATOR] I would have loved to include page numbers, as I think it would have been easier for readers. However, I published through Smashwords, and they specifically advise against including page numbers because they don't always get formatted correctly. Having thought about it some more, I don't think page numbers are that important if you have other ways of bookmarking relevant pages.

Last edited by Dr. Drib; 04-01-2016 at 07:28 AM.
Old 04-01-2016, 01:37 PM   #135
detayls
Addict
detayls ought to be getting tired of karma fortunes by now.

Posts: 235
Karma: 1355374
Join Date: Aug 2007
Device: iPad, iPhone, Kindle Voyage
Sorry...

Quote:
Originally Posted by HarryT View Post
This is not "news". I am baffled by what makes the original poster think that his opinion about Kindle page numbers does constitute "news". Moved to the "General Discussions" forum.
Sorry. To be honest I thought I was posting in the Kindle forum. My bad.
All times are GMT -4.