12-04-2020, 05:54 PM   #21
Ryn
Quote:
Originally Posted by BeckyEbook
I'm not sure if this is exactly what you need, but I will post a few links that may lead you to come up with your own solution.

http://epubsecrets.com/why-i-use-page-list-and-how.php
http://epubsecrets.com/page-list-all...e-doing-it.php

The link to the script is dead, so I'm listing it from web archive:
http://web.archive.org/web/201912181...orohikoscripts

You can write directly to Laura, but I have a feeling you'd better check out the "EPUB Accessibility Using InDesign" video tutorial (available from Lynda.com or LinkedIn), which AFAIK includes the PageStaker and EPUBOgrify script.
The latter is not so important anyway, because it is a simple change that can be done in Sigil.
Thank you for digging up the web archive link for me. This might be just what I'm looking for.

Quote:
Originally Posted by KevinH
If you have a PDF of the printed version of the book, you should be able to print to a PostScript file and use Python on that PostScript file to extract the page numbers and the first n words and last n words on each page (where n is small, say 3) and save that info to a file. Then use sed or some other stream editor with that info to insert the markers you want in each HTML file.

Some custom programming in Python might be needed, but it should be reusable for future projects.
This is also something I'd consider doing, given how big the project is and how much I loathe working from PDFs. By PostScript file, do you mean a text file?

I can see some potential problems with this: the page numbers are at the bottom of the pages, and some pages are empty, which may confuse the matching. That might be something the script could prompt me about, though.
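
For what it's worth, here is a rough Python sketch of the approach KevinH describes, just to see the moving parts. It is not a tested workflow. It assumes the PDF text has already been extracted to plain text with a form feed between pages (as `pdftotext` does) and that the printed page number is the last non-empty line of each page. The function names `page_anchors` and `insert_markers` are made up for illustration, and the marker is an EPUB 3 `epub:type="pagebreak"` span.

```python
import re


def page_anchors(text, n=3):
    """Split extracted text on form feeds and record, for each page, the
    printed page number (ASSUMED to be the last non-empty line) plus the
    first and last n words of the page body."""
    anchors = []
    for raw in text.split("\f"):
        lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
        if not lines or not lines[-1].isdigit():
            continue  # empty page or no number found; a real script might prompt here
        number = lines[-1]
        words = " ".join(lines[:-1]).split()
        if not words:
            continue  # page carries a number but no body text
        anchors.append({"page": number, "first": words[:n], "last": words[-n:]})
    return anchors


def insert_markers(html, anchors):
    """Insert a pagebreak span before the first match of each page's opening
    words, searching forward only so pages stay in order."""
    pos = 0
    for a in anchors:
        # Allow any whitespace between words, since HTML line breaks differ from the PDF's.
        pattern = r"\s+".join(re.escape(w) for w in a["first"])
        m = re.compile(pattern).search(html, pos)
        if m is None:
            continue  # opening words not found (e.g. split by inline tags)
        marker = '<span epub:type="pagebreak" id="page{0}" title="{0}"/>'.format(a["page"])
        html = html[:m.start()] + marker + html[m.start():]
        pos = m.start() + len(marker) + 1
    return html
```

The obvious weak spots are the ones already mentioned: pages without body text are skipped rather than marked, roman-numeral page numbers are ignored by the `isdigit` check, and opening words that are interrupted by inline tags in the HTML will not match. The last n words are recorded but not used here; they could serve as a cross-check when the first n words match more than once.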

Quote:
Originally Posted by phillipgessert
I have not tried this (and frankly even if it works it still sounds pretty miserable) but I wonder if you could work page-by-page unlocking whatever master page element includes the page number, and then use a plugin such as https://www.rorohiko.com/wordpress/i...ds/textstitch/ to auto-thread the page numbers into the document flow.
Or I could try my hand at programming an InDesign plugin for this express purpose. How hard could it be to get a script to recognize the page numbers and cross-reference the indexed page numbers to the first word on the relevant page? Famous last words, I'm sure...

--
Food for thought here, folks. Thanks a lot!