Quote:
Originally Posted by jackie_w
Here are the results of my little experiment using a random single plain epub from my calibre library:
- Create 4 "versions" of the same book:
Code:
book1.epub - the original plain epub (it's epub2 not epub3)
book2.kepub.epub - book1 format-shifted to kepub using KoboTouchExtended (not a full calibre conversion)
book3.epub - copy of book2 with file ext manually changed to .epub
book4.kepub.epub - copy of book1 with file ext manually changed to .kepub.epub
- Drag-drop all 4 into calibre and run Count Pages on all of them to get ADE page count and word count.
- Drag-drop all 4 onto my KA1. I deliberately avoided using calibre to do the transfer to avoid any auto file changes calibre might make in-transit.
- Open all 4 on the KA1 to see the total page count.
Books 1 & 3 open in the Adobe epub renderer.
Books 2 & 4 open in the Kobo kepub renderer.
Results:
Code:
                              <--- Count Pages --->   <--- Kobo entire book --->
File               Filesize   Pages   Words           Adobe Pages   Kepub Pages
book1.epub         1614709    277     98574           277           -
book2.kepub.epub   1665258    321     98574           -             325
book3.epub         1665258    321     98574           321           -
book4.kepub.epub   1614709    277     98574           -             325
Conclusions:
- Comparing books 1 & 3. The extra Kobo spans/divs inflate the ADE page count from 277 --> 321
- Comparing books 2 & 3. These are the same file with 2 different file extensions. ADE pages 321, kepub pages 325. Not the same value, so algorithms are different.
- Similarly, comparing books 1 & 4. The same file with 2 different file extensions. ADE pages 277, kepub pages 325. Not the same value, so algorithms are different.
- Comparing books 2 & 4. The "true" kepub (book2) and the "fake" kepub (book4) have exactly the same kepub page count, 325. The lack of Kobo spans/divs in the "fake" kepub does not affect the kepub page count, so I surmise filesizes are not a factor in calculating kepub page count. My guess is that the algorithm uses wordcount rather than filesizes.
If you're still reading ... thanks for staying awake and feel free to disagree!  I'll be interested to read any of your own test results.
Thanks for doing that so that I don't have to.
When I saw the 321 and 325 for book 2, I figured that was close enough for it to be rounding or something. The algorithm is to divide the compressed file sizes by 1024 (from memory) and then add that up. For a book with multiple chapters, a five-page difference wouldn't be enough to worry me.
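For anyone who wants to try the "compressed size divided by 1024" idea on their own files, here is a rough sketch. It is only my reading of the description above, not the actual Adobe or Kobo code: the function names, the choice to count only (X)HTML files, and rounding each file up to a whole page are all assumptions.

```python
import math
import zipfile

# Assumption: only (X)HTML content documents count toward the total.
CONTENT_EXTENSIONS = (".xhtml", ".html", ".htm")


def pages_from_sizes(compressed_sizes, bytes_per_page=1024):
    """Sum the per-file page estimates, rounding each file up (assumed)."""
    return sum(math.ceil(size / bytes_per_page) for size in compressed_sizes)


def estimate_pages(epub_file, bytes_per_page=1024):
    """Estimate pages from the compressed size of each content file.

    epub_file may be a path or a file-like object; an epub is a zip archive,
    so the compressed size of each member is available from the zip directory.
    """
    with zipfile.ZipFile(epub_file) as zf:
        sizes = [
            info.compress_size
            for info in zf.infolist()
            if info.filename.lower().endswith(CONTENT_EXTENSIONS)
        ]
    return pages_from_sizes(sizes, bytes_per_page)
```

Because each chapter is rounded separately in this sketch, a book with many chapters could drift by a few pages compared with dividing the total once, which is the kind of small difference I had in mind.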
But as the kepub page counts for books 2 and 4 are the same, that suggests they are using something else. It's probably just the character count once the tags are stripped. But I wonder if they are actually using the word count and an average word length factor. They calculate the word count for each chapter when the kepub is first opened. Using that would mean they don't need to reprocess each chapter later to get the full book size.
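To show why a tag-stripped count would give books 2 and 4 the same result, here is a small sketch. Again, this is a guess at the approach, not Kobo's code: the helper names are made up, and the words-per-page factor is simply back-calculated from the one sample above (98574 words / 325 pages, roughly 303 words per page).

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores all tags, including Kobo spans."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text content of a chapter with all markup removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def count_words(markup):
    """Count whitespace-separated words in the tag-stripped text."""
    return len(strip_tags(markup).split())


def estimate_pages_from_words(chapter_word_counts, words_per_page):
    """Hypothetical page estimate: total words over an assumed factor."""
    return sum(chapter_word_counts) / words_per_page


# The same sentence with and without Kobo-style spans has the same word count.
plain = "<p>Hello there, world.</p>"
spanned = '<p><span class="koboSpan" id="kobo.1.1">Hello there,</span> <span>world.</span></p>'
assert count_words(plain) == count_words(spanned)
```

If the per-chapter word counts are already stored when the book is first opened, the total only needs a sum and a division, with no second pass over the chapters.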