View Full Version : Indexes using an original Microsoft word


John123
02-22-2012, 04:43 AM
Hi All,

Has anyone had experience of producing a fully cross-referenced index in epub format from a word file?

Many thanks in advance.

John.

Toxaris
02-22-2012, 01:27 PM
No, there is currently no tool that does this. You have to do it manually. Be aware that an ePUB has no real page numbers, so you have to think of something else to reference.

John123
02-24-2012, 05:13 AM
Hi,

Thanks. This confirms what I was thinking.

As far as I can see, the only way to have a meaningful index is to use the one from the printed book and create an anchor (is this the correct term?) at each original page number, so that when the reader "clicks" a page number in the index it takes them to the start of where the original page was. They would then have to scan the text for the reference they are seeking, in much the same way as with the printed book.

My next question is how would you code the old page number into the text, so that it is unseen, since most will inevitably land in the middle of sentences, then link back to the index file?

I'm using an ascii editor to input the html before dropping this coded text into Sigil.

Many thanks,

John.

Doitsu
02-24-2012, 06:11 AM
My next question is how would you code the old page number into the text, so that it is unseen, since most will inevitably land in the middle of sentences, then link back to the index file?
You could simply add <a id="xxx"/> anchors, which can be placed even in the middle of a sentence and are invisible. Each index entry would then need to contain a hyperlink to the page number anchor that you inserted in the text.

If page numbers are very important to you, you could also implement Adobe's page map extension (http://wiki.mobileread.com/wiki/Adobe_Digital_Editions#Page-map).
However, page maps are only supported by ADE-compatible readers, and ePubs containing them will fail ePub validation.
I.e., if you plan to release your book commercially, you may want to avoid page-maps since they're not part of the ePub standard and many ePub aggregator sites will reject ePubs with page-maps.
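
As a rough sketch of the index side of this, the small Python helper below writes one index line whose page numbers are hyperlinks to page anchors in the text. The file name chapter1.xhtml, the page numbers and the "p001"-style ids are assumptions made up for illustration (the ids carry a letter prefix because an XHTML id may not begin with a digit); adjust them to your own book.

```python
from html import escape

def index_entry(term, pages, text_file="../Text/chapter1.xhtml"):
    """Build one XHTML index line whose page numbers link to page anchors.

    text_file is a hypothetical path; each page is assumed to have a
    matching <a id="pNNN"/> anchor somewhere in that file.
    """
    links = ", ".join(
        f'<a href="{text_file}#p{page:03d}">{page}</a>' for page in pages
    )
    return f"<p>{escape(term)} {links}</p>"

# Illustrative page numbers only, not taken from a real book.
print(index_entry("Disarming", [12, 47]))
```

A book split over several files would need a lookup from page number to file name instead of the single default path used here.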

John123
02-24-2012, 12:25 PM
Hi,

Thanks for the reply. I think the <a id="xxx"/> would work. On reflection, though, how could this link back to the index once the reader has "clicked" through to the page reference? Since there is now no visible reference in the text.

I have attached an .epub file showing a couple of chapters with an index taken from a previously published work. I have permission to do this. The note references work well and cross-reference each other. However, doing this with the above proposition (ie, using the printed book index) leaves nothing visible to "click" in the text and, therefore, no easy way to get the reader back to the electronic index.

In the .epub file I have shown the printed page numbers as:
<a id="xxx"></a>, where xxx is a three digit figure starting 001.

Have I grasped a hot coal, with a problem that's never been addressed in ebooks? I can't find anything sensible on the web.

Thanks,

John.

Toxaris
02-24-2012, 12:53 PM
Well, most readers have a search option, making the index more or less obsolete.

You could also link each index entry directly to the word in the text that it refers to.

Doitsu
02-24-2012, 02:35 PM
Thanks for the reply. I think the <a id="xxx"/> would work. On reflection, though, how could this link back to the index once the reader has "clicked" through to the page reference?

I don't think that there is a good solution for this. That's why you'll hardly find indexes in non-fiction ebooks. And those that have them often contain useless straight-to-ebook copies with non-clickable page numbers.

BTW, many hardware ebook readers also have a hardware Back button that'll take the reader back to the previous location.
I.e. you might not have to provide an explicit linking mechanism back to the list of index entries.

Of course, you could add a link that would take the reader back to the original index entry. (I.e. you'd have two <a> anchors in a row: one index link target and one index list href.)
But each page number could only have one of those links and, IMHO, adding visible page numbers in the middle of the text would somehow disturb the reading flow and moreover page numbers might be mistaken for end-notes.
IMHO, hidden page numbers are a better solution if you decide to implement an index.

Maybe you could convince the author and/or publisher to omit the index in the ebook edition altogether because of these technical issues. After all, an index is primarily useful for users of a print edition, since ebook users can easily search for any word using the search function of their ebook readers.

John123
02-25-2012, 05:36 AM
Hi,

Thanks for your reply. I don't think there is going to be an elegant way of doing this.

I agree with you that we could dispense with an index altogether, since the text is searchable. Unfortunately, academics use indexes extensively as a sort of glorified contents page to get to the area of interest, and may not necessarily know what they are looking for. A good index gives a sort of overview of the text and the people, places and themes contained therein.

In the meantime, I think the best way to crack this is to offer both options: searchable text, using the keywords readers see in the index, and tagged original page numbers, relying on the "back" button to return to the index if necessary.

In this case how would I implement the <a id="xxx"/> tag within an <a href ...> tag?

Many thanks,

John.

Doitsu
02-25-2012, 08:31 AM
In this case how would I implement the <a id="xxx"/> tag within an <a href ...> tag?

You could simply have two <a> statements in a row:

<a id="p001"/><a href="../Text/Index.xhtml">Back to index</a>

The first one would serve as a target marker and the second one could be a link to anywhere else in the book.

BTW, I forgot to mention that the link target id string must not start with a number, because an XHTML id has to be a valid XML name. I.e. you'd have to prefix page numbers with a letter, e.g. "p".

Please find attached an updated version that includes some hyperlinks from the "Disarming" index entry to the actual pages.
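
If the page markers are generated by script rather than typed by hand, something along these lines could produce them. This is only a sketch: the function names are made up, the id check is a simplified ASCII-only rule (real XML names allow more characters), and the index path is the one from the example above.

```python
import re

# Simplified ASCII-only check: a letter or underscore first, then letters,
# digits, periods, hyphens or underscores. Real XML names allow more.
VALID_ID = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")

def page_anchor_id(page, prefix="p"):
    """Turn a print page number into an id such as 'p001'."""
    anchor_id = f"{prefix}{int(page):03d}"
    if not VALID_ID.match(anchor_id):
        raise ValueError(f"invalid id: {anchor_id!r}")
    return anchor_id

def page_marker(page, index_href="../Text/Index.xhtml"):
    """Invisible target anchor followed by a visible link back to the index."""
    return (f'<a id="{page_anchor_id(page)}"/>'
            f'<a href="{index_href}">Back to index</a>')

print(page_marker(1))
```

Passing an empty prefix makes page_anchor_id raise a ValueError, which is the digit-first case described above.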

DaleDe
02-25-2012, 12:51 PM
With regard to the back link, I would suggest that the text in the book itself be used as the back link. For example, if the index entry were Earasaid, then the entry in the book itself would be <a id="Index001" href="../Text/Index.html#earasaid">Earasaid</a>

The index entry would refer to the id Index001, while the back link would be visible via the highlighted word Earasaid in the text. This does two things. The highlighting makes the word easier to find when you arrive from the index, and when you come across it while reading you know it is indexed and can follow the link to see whether there are more references to that word. (Note the double use of the <a> tag to provide both the id for the index link and the back link to the index.)
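
A minimal sketch of generating both halves of such a pair in Python, so that the ids always match. The function name, the chapter file name and the way the index id is derived from the term are assumptions for illustration only.

```python
from html import escape

def two_way_link(term, n, text_file="../Text/chapter1.xhtml",
                 index_file="../Text/Index.html"):
    """Return (in_text, in_index) anchors that point at each other.

    The index id is the lower-cased term with spaces turned into hyphens;
    terms with other punctuation or a leading digit would need more care.
    """
    slug = term.lower().replace(" ", "-")
    ref_id = f"Index{n:03d}"
    in_text = f'<a id="{ref_id}" href="{index_file}#{slug}">{escape(term)}</a>'
    in_index = f'<a id="{slug}" href="{text_file}#{ref_id}">{escape(term)}</a>'
    return in_text, in_index

in_text, in_index = two_way_link("Earasaid", 1)
print(in_text)   # goes into the body text
print(in_index)  # goes into the index file
```

An index entry with several references would need one numbered target per occurrence, with the index line listing a link to each.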

Dale

John123
02-28-2012, 12:59 PM
Hi,

Thank you all very much for taking the time to reply. It was all very informative.

I have now sent off samples, using your advice, to the publishers and await their appraisal.

My initial thought was to have the indexer involved at the start, ie, working on the original Word file prior to typesetting and marking it up for indexing. The keywords would then be tagged with some sort of code that could be used in a search and replace routine. The printed index could be compiled from this mark-up and then, using these codes, the hrefs could be inserted.

However, this would drastically change the way in which indexes are compiled and the way indexers work. Maybe that's what needs to happen.
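
To make the search and replace idea concrete, here is a rough Python sketch. It assumes a purely hypothetical marker, [[keyword]], left in the text by the indexer; it swaps each marker for a linked anchor and records which ids belong to which keyword, so the index links could be built afterwards. The sample sentence in the test call is invented.

```python
import re
from collections import defaultdict

MARK = re.compile(r"\[\[(.+?)\]\]")  # hypothetical marker: [[keyword]]

def link_marked_terms(text, index_file="../Text/Index.html"):
    """Replace each [[keyword]] with a linked anchor.

    Returns the new text and a dict mapping keyword -> list of anchor ids.
    The text is assumed to be XHTML already, so keywords are not escaped.
    """
    targets = defaultdict(list)
    counter = 0

    def replace(match):
        nonlocal counter
        counter += 1
        term = match.group(1)
        ref_id = f"Index{counter:03d}"
        targets[term].append(ref_id)
        slug = term.lower().replace(" ", "-")
        return f'<a id="{ref_id}" href="{index_file}#{slug}">{term}</a>'

    return MARK.sub(replace, text), dict(targets)

sample = "She wore an [[Earasaid]]. The [[Earasaid]] was wool."
new_text, targets = link_marked_terms(sample)
print(new_text)
print(targets)
```

The returned dictionary is what an index generator would loop over to write one link per occurrence under each keyword.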

Cheers,

John.

PS: As an aside, Dale, my sister and brother-in-law live in Grass Valley on Rattlesnake Road. You may well have rubbed shoulders with them! Small world, eh? :o

DaleDe
02-28-2012, 02:07 PM
It is a small world. Actually, if you build an index in Word and then save it as HTML, you might find it is a lot closer than you think. I drive Rattlesnake Road several times a week to get to the church I go to in Cedar Ridge.

Dale

John123
02-28-2012, 02:51 PM
Small world indeed! My brother-in-law is a very keen Christian and church-goer. His name is Steven Butcher; his son Cameron and wife, Shelley, also go to church regularly. Not sure if it's Cedar Ridge though. My sister-in-law is not quite so enthusiastic. Wouldn't it be amazing if you knew them!

John.