MobileRead Forums - View Single Post

Tex2002ans · 11-06-2017, 09:30 PM

Quote:

Originally Posted by Notjohn

Alas, I for one self-publisher find myself revising the print edition more often than formerly in order to keep it in sync with the ebook. And I don't change the metadata or the ISBN, though I do change the year of publication on the title page and in the book description. So I am not helping things.

Yeah, I would most likely just add a little version # or "Last Updated" line in the copyright page.

For example:

Code:

© 2014 by Notjohn
Latest Ebook Revision: November 2017

Sort of like how publishers note which printing it is:

Code:

© 2014 by Notjohn
1st Edition, November 2014
2nd Edition, January 2016
8th Printing, November 2017

Quote:

Originally Posted by doubleshuffle

Sorry Jon, but you're mistaken about the accuracy of ADE page numbers. As has been explained before in this thread, edits to a book, even small ones, affect ADE page numbers.

For instance, when I build a book, it is easiest for my workflow to have large html files with many chapters/poems in the early stages. But at a certain point I split them into one file for each chapter/poem. The ADE page count rises considerably in the process (by hundreds of pages in a collection of many poems).

Big edits, big changes, yes, but the point is that edits do change the numbering; small edits might only make small changes to the page numbers, but even small changes will render the numbering inaccurate.

I agree about the separate HTML chapters per poem throwing off the ADE Page Numbers. I assume the algorithm is "1024 compressed bytes OR new file."
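Here is a rough sketch of that guess in Python, just to show why splitting files inflates the count. The 1024-byte page size and the "every file starts a new page" rule are my assumption from above, not anything Adobe documents, and `estimate_pages` is a hypothetical helper that counts every HTML file in the ZIP rather than walking the spine:

```python
import math
import zipfile

PAGE_BYTES = 1024  # assumed page size, taken from the guess above


def pages_for_sizes(compressed_sizes):
    """Assumed rule: each file adds one page per started 1024 compressed bytes,
    and even a tiny file costs at least one page."""
    return sum(max(1, math.ceil(size / PAGE_BYTES)) for size in compressed_sizes)


def estimate_pages(epub_path):
    """Hypothetical helper: apply the rule to every HTML file inside an EPUB."""
    with zipfile.ZipFile(epub_path) as epub:
        sizes = [
            info.compress_size
            for info in epub.infolist()
            if info.filename.lower().endswith((".html", ".xhtml", ".htm"))
        ]
    return pages_for_sizes(sizes)


# One 3000-byte file versus the same bytes split into ten 300-byte files.
one_big_file = pages_for_sizes([3000])
ten_small_files = pages_for_sizes([300] * 10)
print(one_big_file, ten_small_files)
```

Under this rule the same content goes from 3 pages to 10 just by being split, which is the kind of jump doubleshuffle described with a collection of short poems.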

I also put together a proof-of-concept EPUB to show how easily byte-based methods are thrown off. I just grabbed the first EPUB I saw in the MobileRead EPUB section:

AlexBell's conversion of "Turgenev, Ivan: The Diary of a Superfluous Man and Other Stories. v1 4 Nov 2017"

* * * (All three tests are attached at the end of this post) * * *

Test #1

This is the EPUB exactly as downloaded. I don't believe AlexBell uses Sigil to build the EPUB (just open it up and take a look at the structure).

If you look at his HTML, he also uses HTML character codes for characters such as … + – + ’ + [...]

Test #2

Here I opened the EPUB in Sigil and just saved it.

Sigil moves things into its own folder structure, and the largest change is that the character codes become their actual Unicode characters: … + – + ’ + [...]

Test #3

I then took all of Chapter 1's HTML and copied/pasted it below the original in the same file. Then I wrapped the entire duplicate Chapter 1 in a giant HTML comment.

Test Pages

ADE Pages

Test #1 = 145
Test #2 = 151
Test #3 = 194

Amazon Location #s

Test #1 = 2835
Test #2 = 2706
Test #3 = 2706

Summary

ADE

Test #1 -> Test #2:

The Unicode characters seem to compress slightly differently:

The Test #1 files are larger uncompressed, but smaller compressed.
The Test #2 files are smaller uncompressed, but the tiniest bit larger compressed.

I think this is where the discrepancy of 6 ADE Pages comes from.
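If you want to poke at this yourself, here is a minimal sketch that compares the two forms of the same text. The sample sentence is made up, and I am using plain zlib at level 9 as a stand-in for whatever the EPUB's ZIP packaging does, so treat the numbers as illustrative only. Which version compresses smaller depends on the actual text, so run it on a real chapter before drawing conclusions:

```python
import html
import zlib

# Made-up sample line using named character codes, repeated to get some bulk.
entity_text = (
    "<p>&ldquo;Wait&hellip; what&rsquo;s that?&rdquo; &ndash; he asked.</p>\n" * 200
)
# Same text with the codes turned into real Unicode characters.
# (html.unescape is fine here because the sample has no &lt; or &amp;;
# on real markup it would also unescape those and break the HTML.)
unicode_text = html.unescape(entity_text)


def sizes(text):
    """Return (uncompressed bytes, zlib-compressed bytes) for UTF-8 text."""
    raw = text.encode("utf-8")
    return len(raw), len(zlib.compress(raw, 9))


for label, text in (("character codes", entity_text), ("unicode", unicode_text)):
    raw_len, packed_len = sizes(text)
    print(f"{label}: {raw_len} bytes raw, {packed_len} bytes compressed")
```

The raw size always shrinks when the codes become characters (a code like `&hellip;` is 8 bytes, the character itself is 3 bytes in UTF-8), but the compressed size does not have to move in the same direction.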

Test #2 -> Test #3:

ADE's algorithm counts HTML Comments... so the ADE Pages between #2 and #3 are COMPLETELY thrown off.

And it was VERY odd:

I was assuming the test would go like this:

Test #2. Start in Chapter 1, and go all the way until you hit the "March 29" section.
Test #3. Do the same thing. The ADE Page #s SHOULD be the same. I was expecting a GIANT jump only when you go from Chapter 1->Chapter 2 (the location of that giant HTML comment).

Instead:

Test #2, "March 29" was located on ADE Page 37.
Test #3, "March 29" was located on ADE Page 70.

(And Test #1, "March 29" was located on ADE Page 35.)

Kindles

Test #1 -> Test #2:

The Location #s changed by about 100, just from HTML codes -> Unicode characters.

Test #2 + #3:

These are the same Location #s on Kindles!

Note: MOBI/KindleGen throws away HTML comments (Could be bad or good depending on how you look at it. I personally use HTML comments often when working with images of formulas. I find them to be VERY important. One of the other reasons why I prefer EPUB.)
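A small sketch of why Test #3 moves the ADE numbers but not the Kindle ones. The regex strip is only my stand-in for "the converter throws comments away" (I don't know how KindleGen actually does it), and the chapter text is made up:

```python
import re
import zlib

COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(markup: str) -> str:
    """Stand-in for a converter that drops HTML comments."""
    return COMMENT_RE.sub("", markup)


def compressed_len(markup: str) -> int:
    """Bytes after zlib compression, as a rough proxy for what a byte method sees."""
    return len(zlib.compress(markup.encode("utf-8"), 9))


# Made-up chapter, then the same chapter duplicated inside one giant comment.
chapter = "<h2>Chapter 1</h2>\n" + "".join(
    f"<p>Paragraph number {i} of the diary.</p>\n" for i in range(300)
)
with_comment = chapter + "<!--\n" + chapter + "-->\n"

# A byte-counting method sees the comment; a comment-stripping one does not.
print(compressed_len(chapter), compressed_len(with_comment))
print(strip_comments(with_comment) == chapter + "\n")
```

Once the comment is stripped, the two versions are the same text again, which matches the identical Location #s on the Kindle side.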

Quote:

Originally Posted by Hitch

I don't mind admitting that that's weird. You'd think that the way that the "pages" are calculated, in ADE, would not change that much, by the simple expedient of splitting the pages, (files), unless HTML characters--and not simply text characters--are being counted, e.g., what's in the head.

I also recall ADE Page Numbers being thrown off because of EPUBs that weren't fully compressed.

We are so used to EPUBs coming out of Sigil/Calibre (I believe they just use maximum ZIP compression).

But I recall when I first started working on EPUBs, a lot of the old ones I was cleaning up were zipped with little or no compression.

I remember ADE Page numbers varying wildly just because of that minor change in packaging. I would save in Sigil and be scratching my head (before I knew better and how you couldn't rely on ADE Pages!). :P
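You can see the packaging effect with a few lines of Python. This is only a sketch: the chapter text is made up, the file name is arbitrary, and the 1024-bytes-per-page figure is the same assumption as earlier in this post:

```python
import io
import math
import zipfile


def compressed_size(data: bytes, method: int) -> int:
    """Write data into an in-memory ZIP with the given method and
    report the size the archive records for that entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("chapter1.xhtml", data)
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo("chapter1.xhtml").compress_size


# Made-up, very repetitive chapter text.
chapter = ("<p>It was a dark and stormy night.</p>\n" * 500).encode("utf-8")

stored = compressed_size(chapter, zipfile.ZIP_STORED)      # no compression
deflated = compressed_size(chapter, zipfile.ZIP_DEFLATED)  # normal compression

# Assumed 1024 compressed bytes per page, as guessed earlier in this post.
print("stored:  ", stored, "bytes ->", math.ceil(stored / 1024), "pages")
print("deflated:", deflated, "bytes ->", math.ceil(deflated / 1024), "pages")
```

Same HTML, same words, and the "page" count changes purely because of how the file was zipped.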

Quote:

Originally Posted by Hitch

(It would be easier if the IDPF would come up with something, anything, that would make citing more consistent, but as we've all discussed here, before, on any number of topics--so what? Even if they did, what's to say that anyone would follow it? We could probably count on Apple not to; B&N, who knows--arguably, they're getting out of the biz, and then..well.)

Doitsu and I have been having more of our PM chats.

There was an "I Annotate" conference earlier this year (speeches can be found here):

https://www.youtube.com/user/hypths/videos

It seems as if the W3C's Web Annotation standard has been finalized. (See "I Annotate 2017 Day 1 W3C Standards for Web Annotation: Rob Sanderson")

I would probably say that would be one of the best bets in getting Annotation supported in/across browsers (and thus, trickle its way down to ereaders).

From what I could see, a potential proof-of-concept might be a Google Books-type situation, where you could have a given link send you to a very specific section of a PDF/HTML/EPUB(?) file, and have your highlights/notes on top of it.

And that stuff might be okay when using digital documents to cite other digital documents, but I still don't see that as a good way to cite physical <-> digital. (So you wouldn't be able to have the same text citation in your Print + ebook.)

Anyway, I'll be listening through all those I Annotate speeches over the coming week, and I'll be jotting down my own summaries of each speech. I could send you a copy if you are interested.