Hi,
I have several InDesign exports in which the authors’ names in the index are followed by links labelled with page numbers.
Like this:
First Author
18, 23.
Second Author
123, 259, 368. etc.
Code:
<p class="inp">First Author <a href="epi1.xhtml#idx104">18</a>, <a href="epi2.xhtml#idx624">23</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">123</a>, <a href="epi1.xhtml#idx178">259</a>, <a href="epi1.xhtml#idx241">368</a></p>
I need to replace these page numbers in the e-book with ascending numbers.
Like this:
First Author
1, 2.
Second Author
1, 2, 3. etc.
Code:
<p class="inp">First Author <a href="epi1.xhtml#idx104">1</a>, <a href="epi2.xhtml#idx624">2</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">1</a>, <a href="epi1.xhtml#idx178">2</a>, <a href="epi1.xhtml#idx241">3</a></p>
I wrote a simple but working Python script that does the job. It looks like this:
Code:
import re

i = 0

def IncrementalNumbers(m):
    global i
    i += 1
    return str(i)

PageNumbers = r'(\d+)(?=</a>)'

with open("index.xhtml", 'r') as fp, open("index_renumbered.xhtml", "w") as out:
    # read only one line of the file and apply the transformations
    for line in fp:
        i = 0
        l = re.sub(PageNumbers, IncrementalNumbers, line)
        out.write(l)
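The same per-line renumbering can also be written without the global counter. This is only a minimal sketch: `renumber_line` and `PAGE_NUMBER` are names made up for the illustration, and it assumes, as in the script above, that every index entry sits on a single line.

```python
import re
from itertools import count

# A number counts as a page number only if it is directly followed by </a>.
PAGE_NUMBER = re.compile(r'\d+(?=</a>)')

def renumber_line(line):
    """Replace each linked page number in one line with 1, 2, 3, ..."""
    counter = count(1)  # a new counter per call, so there is nothing to reset
    return PAGE_NUMBER.sub(lambda m: str(next(counter)), line)

sample = ('<p class="inp">Second Author <a href="epi1.xhtml#idx057">123</a>, '
          '<a href="epi1.xhtml#idx178">259</a></p>')
print(renumber_line(sample))
```

Because the counter lives inside the function, calling it for the next line always starts again at 1.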
It would be much faster if I could use this as a plugin, but unfortunately, I have never written a plugin before, and I don't have enough knowledge for it.
My first attempt was only half successful, because the plugin counts globally, not line by line.
Code:
import re
import sys
import sigil_bs4
from bs4 import BeautifulSoup

text_type = str

i = 0

def IncrementalNumbers(m):
    global i
    i += 1
    return str(i)

PageNumbers = r'(\d+)(?=</a>)'
#RefSymbol = '←'

def run(bk):
    for (id, href) in bk.text_iter():
        print('Start %s:' % href)
        html = bk.readfile(id)
        soup = sigil_bs4.BeautifulSoup(html)
        html_orig = html
        html = re.sub(PageNumbers, IncrementalNumbers, html)
        if not html == html_orig:
            print("Modified File --> ", id)
            bk.writefile(id, html)
    return 0

def main():
    print("I reached main when I should not have\n")
    return -1

if __name__ == "__main__":
    sys.exit(main())
This is the result:
Second Author 3, 4, 5 instead of
Second Author 1, 2, 3
Code:
<p class="inp">First Author <a href="epi1.xhtml#idx104">1</a>, <a href="epi2.xhtml#idx624">2</a></p>
<p class="inp">Second Author <a href="epi1.xhtml#idx057">1</a>, <a href="epi1.xhtml#idx178">2</a>, <a href="epi1.xhtml#idx241">3</a></p>
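What I am after is a counter that restarts for every index entry instead of running on through the whole file. As a rough sketch of that idea, and only under the assumption that every entry is one `<p class="inp">` paragraph as in my samples, it might look like the code below. `renumber_entry` and `renumber_index` are names I made up; `bk.text_iter`, `bk.readfile` and `bk.writefile` are the same calls my plugin already uses. I am not sure this is the right way to do it in a plugin.

```python
import re
from itertools import count

# Assumption: each index entry is a single <p class="inp">...</p> paragraph.
ENTRY = re.compile(r'<p class="inp">.*?</p>', re.DOTALL)
PAGE_NUMBER = re.compile(r'\d+(?=</a>)')

def renumber_entry(m):
    """Renumber the linked page numbers inside one index entry."""
    counter = count(1)  # fresh counter for every entry
    return PAGE_NUMBER.sub(lambda _: str(next(counter)), m.group(0))

def renumber_index(html):
    """Apply renumber_entry to every index entry found in the markup."""
    return ENTRY.sub(renumber_entry, html)

def run(bk):
    # Plugin entry point: rewrite only the files that actually change.
    for file_id, href in bk.text_iter():
        html = bk.readfile(file_id)
        new_html = renumber_index(html)
        if new_html != html:
            print("Modified file -->", href)
            bk.writefile(file_id, new_html)
    return 0
```

Matching whole paragraphs rather than lines is a deliberate choice in this sketch, because the file may not have exactly one entry per line once it has been loaded into Sigil. Files without any `<p class="inp">` paragraph are left untouched.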
Can someone help me make the plugin restart the count on every line?
Thank you