Old 04-12-2018, 08:00 AM   #13
gmw
cacoethes scribendi
Posts: 5,818
Karma: 137770742
Join Date: Nov 2010
Location: Australia
Device: Kobo Aura One & H2Ov2, Sony PRS-650
If the idea of counting characters were correct, then I believe you should count the spaces. However, even though various places on the 'net talk about counting "1024 characters", and even explicitly "1024 Unicode characters", I suspect that this is not the algorithm used by ADE (if that is what you're interested in matching).

Adobe aren't advertising it, but Safari Books Online is one of several references I've found that say:

When there is no explicit page-map ADE calculates a page as being 1024 bytes of the compressed resource stream.

This would seem to make very good sense because it allows ADE to calculate page counts directly off the index, without decompressing every resource to count characters (typically each chapter is stored as a separate resource).
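To illustrate the point about working directly off the index, here is a minimal Python sketch that reads compressed sizes from the zip central directory without decompressing anything. It is not ADE's actual code. Picking content documents by file extension and rounding each resource up to whole pages separately are both my assumptions, and the function names are made up for the example.

```python
import math
import zipfile

PAGE_BYTES = 1024  # page size quoted in the references above


def compressed_sizes(epub, suffixes=(".xhtml", ".html", ".htm")):
    """Map each content document to its compressed size in bytes.

    Only the zip central directory is read; nothing is decompressed.
    Selecting files by extension is an assumption made for this sketch
    (a real reader would presumably follow the OPF spine instead).
    """
    with zipfile.ZipFile(epub) as zf:
        return {
            info.filename: info.compress_size
            for info in zf.infolist()
            if info.filename.lower().endswith(suffixes)
        }


def estimate_pages(sizes):
    """Estimate a page count from compressed sizes.

    Assumption: each resource is rounded up to whole pages on its own.
    """
    return sum(math.ceil(size / PAGE_BYTES) for size in sizes)
```

Usage would be something like `estimate_pages(compressed_sizes("book.epub").values())`, where `book.epub` is a placeholder path.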

Curiously, the MobileRead Wiki says both "1024 Unicode characters" (under the Page-Map heading) and "1024 compressed bytes" (under the Page Numbers heading).

The "1024 bytes of compressed stream" rule seems to match the few epubs I've checked, but I can't say that I've been comprehensive in my testing.

ETA: For example, I have a copy of Ernest Hemingway's The Old Man and the Sea. It has 132753 Unicode characters counting spaces (would be 130 pages); it has 106731 Unicode characters not counting spaces (would be 105 pages); its compressed streams are 413 bytes, 849 bytes and 154625 bytes (would be 51 pages). My Sony reader and my Kobo Aura One say 51 pages. Calibre viewer says 138 pages. So go with what you like best.
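For anyone wanting to repeat the character-count arithmetic, a small Python sketch follows. Rounding up to a whole page is my assumption (it is what reproduces the 130 and 105 figures), and the function names are just for illustration.

```python
import math

CHARS_PER_PAGE = 1024  # the "1024 characters" rule discussed above


def pages_from_count(count, per_page=CHARS_PER_PAGE):
    """Page count for a given character count, rounding up (assumed)."""
    return math.ceil(count / per_page)


def count_characters(text, include_spaces=True):
    """Count Unicode characters, optionally skipping all whitespace."""
    if include_spaces:
        return len(text)
    return sum(1 for ch in text if not ch.isspace())


print(pages_from_count(132753))  # counting spaces -> 130
print(pages_from_count(106731))  # not counting spaces -> 105
```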

Last edited by gmw; 04-12-2018 at 08:24 AM.