Hi all,
I am working on a project that involves exporting 15-20 books of about 500 pages each to ePub.
All these books have large and essential indexes that need to make it into the ePub, preferably linked.
Because the indexes were created by hand, outside InDesign, that program cannot make them active. It would be really nice if InDesign could link the page numbers it generates itself to a list of page numbers in the same document. Sadly, it cannot.
As one of the source files was not in InDesign but in QuarkXPress, in a version which I do not have, I experimented with loading the PDF into Acrobat and exporting it to a Word file. This gives me the page numbers, which are carried into the Word file along with all the other header and footer material.
I then did the following:
- Exporting the Word document to ePub with OpenOffice Writer's writer2ePub plugin. This yields an ePub with one XHTML document per page.
- In Sigil, using a regex to turn each page number of the book into a self-closing <a> tag with the proper id (of the form pxxx) and moving it into the topmost paragraph tag of the page.
- Then merging all the pages that are not part of the index into one XHTML file, and doing the same for the pages that are part of the index.
- Iterating through the index file, using regexes to find page numbers and link them to the proper anchor inside the book document. There are ins and outs to this that I will gloss over here.
- Finally, splitting the book up into its logical chapters: usually one XHTML file per chapter, and likewise for the notes etc. Thankfully, Sigil knows how to update the links once XHTML files are split up.
- Sadly, because of the PDF export, there is still a lot of cleaning to do, as Acrobat's PDF reading and exporting system does not understand hyphenation. There will also be headers and/or footers, lots of unwanted hard and soft returns, whitespace, and no styles except (hopefully) italics, bold, superscripts and subscripts. Also, the index now links only to the page, not to the proper paragraph or sentence; that might be possible in an ePub, but it is beyond the scope of this project.
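For anyone who would rather script the page-anchor step than do it in Sigil, here is a rough Python sketch of the idea. It is not what I actually ran: it assumes the exported page number sits alone in its own paragraph (something like <p>123</p>), which will not hold for every export, and the function name add_page_anchor is made up for illustration.

```python
import re

# Assumption: the page number is alone in its own paragraph, e.g. <p>123</p>.
PAGE_NUM = re.compile(r'<p\b[^>]*>\s*(\d{1,4})\s*</p>\s*')
# Opening tag of a paragraph (does not match <pre> and similar).
FIRST_P = re.compile(r'<p\b[^>]*>')

def add_page_anchor(xhtml):
    """Replace the standalone page-number paragraph with an anchor
    placed at the start of the topmost paragraph of the page."""
    m = PAGE_NUM.search(xhtml)
    if m is None:
        return xhtml  # no recognizable page number, leave the page alone
    anchor = '<a id="p%s"/>' % m.group(1)
    # Drop the page-number paragraph itself.
    stripped = xhtml[:m.start()] + xhtml[m.end():]
    # Insert the anchor right after the first opening <p ...> tag.
    result, count = FIRST_P.subn(lambda t: t.group(0) + anchor, stripped, count=1)
    # If the page has no other paragraph, keep the original untouched.
    return result if count else xhtml

page = '<body><p>First text.</p><p>More.</p><p>123</p></body>'
print(add_page_anchor(page))
```

Run over each per-page file before merging, this leaves one anchor per page. Pages where the number shares a paragraph with header or footer text would need their own pattern.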
It is definitely a half-way solution, but one that can be simplified to some extent within Sigil, by using the Saved Searches functionality.
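For those who prefer a script to Saved Searches, the index-linking pass could look roughly like the Python sketch below. The assumptions are mine and purely illustrative: index entries shaped like "Aardvark, 12, 45-47, 301", anchors of the form p123 in a merged file hypothetically named book.xhtml, and only the first page of a range being linked. It also skips any number that has no matching anchor, which helps catch stray numbers but will not save you from years or other digits that happen to follow a comma.

```python
import re

# A page reference here is 1-4 digits directly preceded by ", ".
# In a range such as 45-47 only the first number (45) matches.
PAGE_REF = re.compile(r'(?<=, )(\d{1,4})\b')

def collect_anchors(book_xhtml):
    """Return the set of page numbers that have an <a id="pNNN"/> anchor."""
    return set(re.findall(r'<a id="p(\d+)"', book_xhtml))

def link_index(index_xhtml, known_pages, book_file="book.xhtml"):
    """Wrap each page reference in a link to its anchor in book_file.
    book_file is a hypothetical name for the merged book document."""
    def repl(m):
        page = m.group(1)
        if page not in known_pages:
            return page  # no anchor for this number, leave it as plain text
        return '<a href="%s#p%s">%s</a>' % (book_file, page, page)
    return PAGE_REF.sub(repl, index_xhtml)

book = '<p><a id="p12"/>Text</p><p><a id="p45"/>More</p>'
index = '<p>Aardvark, 12, 45-47, 301</p>'
print(link_index(index, collect_anchors(book)))
```

Since Sigil rewrites the links when the merged file is later split into chapters, pointing everything at the single merged file at this stage should be enough.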
The whole index thing took me about two hours, from start to finish, for a 4k entry index. One book down, fourteen (or more) to go.
I am posting this as a hack, but in this community it is likely there are minds that can spot weaknesses in my approach. Feel free to shoot holes, and help me out for the remainder of the series.
------
As an edit: for those who come after, looking for a solution, there are valuable comments in the thread.
The outstanding ones, from my perspective, include the realization that there are InDesign plugins that export the page numbers into ePub, here:
Quote:
Originally Posted by BeckyEbook
Saving the best for last, there is actually a script that links "dead" indexes to their InDesign page numbers and exports them to ePub. It's called LiveIndex, and can be bought here. (I have no relationship to this developer.)
https://www.id-extras.com/products/liveindex/