Incremental numbers instead of page numbers in the Index

NL77 · 03-19-2025, 09:58 AM

Hi,
I have several InDesign exports in which the index entries for authors' names link to page numbers, like this:
First Author 18, 25.
Second Author 123, 259, 368. etc.

Code:

<p class="inp">First Author <a href="epi1.xhtml#idx104">18</a>, <a href="epi2.xhtml#idx624">23</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">123</a>, <a href="epi1.xhtml#idx178">259</a>, <a href="epi1.xhtml#idx241">368</a></p>

In the e-book, I need to replace these page numbers with ascending numbers that restart at 1 for each entry, like this:
First Author 1, 2.
Second Author 1, 2, 3. etc.

Code:

<p class="inp">First Author <a href="epi1.xhtml#idx104">1</a>, <a href="epi2.xhtml#idx624">2</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">1</a>, <a href="epi1.xhtml#idx178">2</a>, <a href="epi1.xhtml#idx241">3</a></p>

I wrote a simple Python script that does the job:

Code:

import re

i = 0

def IncrementalNumbers(m):
    global i
    i+=1
    return str(i)

PageNumbers = r'(\d+)(?=</a>)'

with open("index.xhtml", 'r') as fp, open("index_renumbered.xhtml","w") as out:
    # process the file one line at a time; the counter restarts for each line
    for line in fp:
        i = 0
        l = re.sub(PageNumbers, IncrementalNumbers, line)
        out.write(l)

It would be much faster if I could run this as a Sigil plugin, but I have never written a plugin before and don't know enough to do it properly.
My first attempt was only half successful: the plugin counts across the whole file instead of restarting on each line.

Code:

import re
import sys
import sigil_bs4
from bs4 import BeautifulSoup

text_type = str

i = 0

def IncrementalNumbers(m):
    global i
    i+=1
    return str(i)

PageNumbers = r'(\d+)(?=</a>)'
#RefSymbol = '←'

def run(bk):
    for (id, href) in bk.text_iter():
        print('Start %s:' % href)
        html = bk.readfile(id)
        soup = sigil_bs4.BeautifulSoup(html)

    html_orig = html

    html = re.sub(PageNumbers, IncrementalNumbers, html)


    if not html == html_orig:
        print("Modified File --> ", id)
        bk.writefile(id, html)

    return 0


def main():
    print("I reached main when I should not have\n")
    return -1

if __name__ == "__main__":
    sys.exit(main())

This is the result: Second Author 3, 4, 5 instead of Second Author 1, 2, 3

Code:

<p class="inp">First Author <a href="epi1.xhtml#idx104">1</a>, <a href="epi2.xhtml#idx624">2</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">3</a>, <a href="epi1.xhtml#idx178">4</a>, <a href="epi1.xhtml#idx241">5</a></p>

Can someone help me make the plugin restart the count on each line?
Thank you

Haudek · 03-19-2025, 11:36 AM

Try this.

Spoiler:

Look at the line marked "RESET HERE". That is where the counter is reset, so each paragraph is counted separately.
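
The code from the spoiler is not reproduced in this thread. As a rough sketch of the idea only (not necessarily what was posted), a per-paragraph reset could look like the block below. It assumes every index entry is a `<p class="inp">` paragraph, as in the sample markup from the first post. The helper names `renumber_paragraph` and `renumber_index` are made up for illustration; `bk.text_iter()`, `bk.readfile()` and `bk.writefile()` are the same plugin calls already used in the first attempt.

```python
import re

# One index entry. Assumption: entries are <p class="inp"> paragraphs,
# as in the sample markup from the first post.
INDEX_PARAGRAPH = re.compile(r'<p class="inp">.*?</p>', re.DOTALL)
# The page number that forms the link text, i.e. digits right before </a>.
PAGE_NUMBER = re.compile(r'\d+(?=</a>)')


def renumber_paragraph(match):
    """Replace the page numbers of one index paragraph with 1, 2, 3, ..."""
    counter = 0  # RESET HERE: every paragraph starts with a fresh counter

    def next_number(_m):
        nonlocal counter
        counter += 1
        return str(counter)

    return PAGE_NUMBER.sub(next_number, match.group(0))


def renumber_index(html):
    """Renumber every index paragraph found in one XHTML document."""
    return INDEX_PARAGRAPH.sub(renumber_paragraph, html)


def run(bk):
    # Plugin entry point; bk is the book container passed in by Sigil.
    for file_id, href in bk.text_iter():
        html = bk.readfile(file_id)
        new_html = renumber_index(html)
        if new_html != html:
            print("Modified file -->", href)
            bk.writefile(file_id, new_html)
    return 0
```

Because the counter lives inside `renumber_paragraph`, no `global` is needed, and files without any `<p class="inp">` paragraphs are left untouched.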

NL77 · 03-19-2025, 01:24 PM

Thank you, it works beautifully, I am very grateful.

I don't know how common this problem is, but I'm working on a number of textbooks with indexes, and you've saved me precious minutes.
