Quote:
Originally Posted by JSWolf
ADE page numbering is the one true page number for ePub.
Page-map and pagelist both give consistent page numbers across devices, since they depend on a pre-built page number list, generally based on a hardcover paper edition of the book. One issue is that a page number can fall anywhere on the screen. At one point calibre supported page-map in ePubs, but that disappeared around 2009-2010 (going by my vague recollection).
There were other synthetic page numbering schemes for ePub that worked, or could work, across devices. The key point is that both of the alternatives to the Adobe algorithm that I am aware of mimic the Adobe SPN: they calculate the pages in the ePub from the number of synthetic pages in each text file. Those pages have no relationship to the screens seen when reading, so a page number can occur anywhere on the screen.
For all of them, the base algorithm is similar to the one used by Adobe:
- Determine the compressed byte length of each resource referenced in the spine, subtracting any known encryption overhead (IV size).
- Assume that there is a page for each 1024 bytes in each resource, rounding up to the nearest whole number of pages for each resource.
- To map page breaks into a resource, take the number of pages determined for it in step 2 and count the number of Unicode characters in the resource. Distribute synthetic page breaks evenly through the characters by dividing the number of characters by the number of pages; if the characters don’t divide evenly among the pages, round the number of characters per page up and let the last “page” contain fewer characters than the rest.
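The three steps above can be sketched roughly in Python. This is only an illustration of the description, not Adobe's actual code. The function names, the `encryption_overhead` parameter, the "at least one page per resource" rule and the clamping of trailing offsets are all my own assumptions. The compressed size would normally come from the zip entry (for example `zipfile.ZipInfo.compress_size`); here it is just passed in.

```python
import math

def synthetic_page_breaks(text, compressed_size, bytes_per_page=1024,
                          encryption_overhead=0):
    """Return the character offsets where synthetic pages start in one resource."""
    # Step 1: compressed length minus any known encryption overhead (e.g. IV size).
    payload = max(compressed_size - encryption_overhead, 0)
    # Step 2: one page per bytes_per_page bytes, rounded up.
    # Assumption: even an empty resource counts as one page.
    pages = max(math.ceil(payload / bytes_per_page), 1)
    # Step 3: spread the breaks evenly over the Unicode characters,
    # rounding characters-per-page up so the last page is the short one.
    chars = len(text)  # len() counts Unicode code points in Python 3
    chars_per_page = math.ceil(chars / pages)
    # Assumption: offsets past the end of a very short text are clamped.
    return [min(i * chars_per_page, chars) for i in range(pages)]

def number_book(resources, bytes_per_page=1024):
    """resources: (name, text, compressed_size) tuples in spine order.

    Returns (page_number, resource_name, char_offset) for every synthetic page.
    """
    result = []
    page = 1
    for name, text, size in resources:
        for offset in synthetic_page_breaks(text, size, bytes_per_page):
            result.append((page, name, offset))
            page += 1
    return result

# 2500 characters that compress to 2100 bytes: 3 pages of up to 834 characters.
print(synthetic_page_breaks("a" * 2500, 2100))  # [0, 834, 1668]
```

Since the page count comes from the compressed size and the break positions from the character count, the breaks land wherever the arithmetic puts them, which is why a page number can show up anywhere on the screen.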
The other two differed in the number of compressed characters per page (1000 and 1023), in their allowance for encryption and compression overhead and, for one of them, in the ability to recognize images and Base64 inline images. That one required unencrypted ePubs to work properly and was slower than molasses, since it had to unzip each text file, but it could also add a page-map file and page numbers internal to the ePub, which looked something like <a id="Page_023"/>, going off my rather vague recollection. Both of those projects have long since become abandonware.
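To give an idea of what that tool's output might have looked like, here is a rough Python sketch that builds anchor ids in the Page_023 style and a matching page-map file. The element and attribute names follow my understanding of the Adobe page-map extension, so treat them as an assumption. The function names and the (href, page number) input format are made up for illustration.

```python
from xml.sax.saxutils import quoteattr

def page_anchor_id(page_number):
    """Anchor id in the zero-padded style recalled above, e.g. Page_023."""
    return f"Page_{page_number:03d}"

def page_anchor(page_number):
    """The empty anchor element that would be placed in the text file."""
    return f'<a id="{page_anchor_id(page_number)}"/>'

def build_page_map(entries):
    """entries: (href, page_number) pairs in reading order.

    Assumed layout: a <page-map> root with one <page name=... href=.../> per page.
    """
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<page-map xmlns="http://www.idpf.org/2007/opf">']
    for href, page in entries:
        target = f"{href}#{page_anchor_id(page)}"
        # quoteattr escapes the value and adds the surrounding quotes.
        lines.append(f"  <page name={quoteattr(str(page))} href={quoteattr(target)}/>")
    lines.append("</page-map>")
    return "\n".join(lines)

print(page_anchor(23))
print(build_page_map([("chapter02.xhtml", 23), ("chapter02.xhtml", 24)]))
```

A real tool would also have to put each anchor at the right character offset without landing inside a tag, which this sketch does not attempt.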