Quote:
|
But I'm not sure how to get the data from the toc.
|
You can create a persistent dict (e.g. mydata) that survives from one pass of the regex-function to the next.
You'll need two distinct parts in your function: the first to get the titles from the toc, and the second to make the changes.
Then run the function twice, each time with a different regex and on different files:
-- 1st pass (get information from the toc):
Code:
Displayed file : your toc
find : <p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text"><strong class="calibre1">(\d+)[^>]+>\s?([^<]+)
scope : current file
For each occurrence, you'll get:
match[1] -> file name
match[2] -> tag
match[3] -> chap-number
match[4] -> chap title
Store this in your dict mydata: the key can be the file name or the tag, and the value is itself a dict holding the remaining fields, e.g. (tag, chap-num, title).
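To see what the first regex captures, here is a minimal sketch you can run outside calibre with plain Python. The sample toc line is invented for illustration (your markup may differ), and keying the dict on the tag is just one of the two options mentioned above.

```python
import re

# Pattern from the post, unchanged (split over two lines for readability).
TOC_PATTERN = re.compile(
    r'<p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text">'
    r'<strong class="calibre1">(\d+)[^>]+>\s?([^<]+)'
)

# Hypothetical toc line, made up for this example.
sample = ('<p class="toc1"><a href="part0005.html#chap01" class="toc_text">'
          '<strong class="calibre1">1</strong> The Beginning</a></p>')

mydata = {}
for m in TOC_PATTERN.finditer(sample):
    # Key on the tag (group 2); keep the other groups as the value.
    mydata[m[2]] = {'file': m[1], 'num': m[3], 'title': m[4]}

print(mydata)
# {'chap01': {'file': 'part0005.html', 'num': '1', 'title': 'The Beginning'}}
```

Keying on the tag is convenient here because the second regex captures the id of the div, which should be the same string as the anchor in the toc link.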
Then, on the second pass, you fill in your headers:
Code:
find : <div class="fullimage" id="([^"]+)".+/</div>
scope : all text files
The skeleton of your function will be something like this (not tested):
Code:
mydata = {}  # This dict survives between two passes of the same function
TOC_FILE = 'toc.html'  # placeholder: adapt this to the name of the file holding your toc

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # First pass: get info and fill mydata
    if file_name == TOC_FILE:
        # Key on the tag, because that is what the second regex captures as match[1]
        mydata[match[2]] = {'file': match[1], 'num': match[3], 'title': match[4]}
        # You can check the values with print(mydata)
        return match[0]
    # Second pass: replace headers
    if match[1] in mydata:
        chap = mydata[match[1]]
        header = chap['num'] + ' – ' + chap['title']
        return f'<h1>{header}</h1>'  # adapt this
    else:
        print(f'title not found for file {file_name}, tag {match[1]}')
        return match[0]
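If you want to try the two-pass logic before touching the book, the sketch below simulates it with plain re.sub. Everything around the function is an assumption made for the test: run_pass is a hypothetical stand-in for one search-and-replace run, the file names and sample markup are invented, the function signature is shortened, and the second pattern is a simplified version of the one above. It does not show how calibre itself keeps the dict alive between runs.

```python
import re

mydata = {}            # module-level, so it outlives a single pass
TOC_FILE = 'toc.html'  # hypothetical name, adapt it

def replace(match, number, file_name, *args, **kwargs):
    if file_name == TOC_FILE:
        # First pass: remember each toc entry under its tag.
        mydata[match[2]] = {'file': match[1], 'num': match[3], 'title': match[4]}
        return match[0]
    # Second pass: swap the div for a heading when the tag is known.
    chap = mydata.get(match[1])
    if chap is None:
        return match[0]
    return f"<h1>{chap['num']} – {chap['title']}</h1>"

def run_pass(pattern, text, file_name):
    """Stand-in for one search-and-replace run over a single file."""
    count = 0
    def call(m):
        nonlocal count
        count += 1
        return replace(m, count, file_name)
    return re.sub(pattern, call, text)

TOC_RE = (r'<p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text">'
          r'<strong class="calibre1">(\d+)[^>]+>\s?([^<]+)')
DIV_RE = r'<div class="fullimage" id="([^"]+)">.*?</div>'  # simplified pattern

# Invented sample markup.
toc = ('<p class="toc1"><a href="part0005.html#chap01" class="toc_text">'
       '<strong class="calibre1">1</strong> The Beginning</a></p>')
chapter = '<div class="fullimage" id="chap01"><img src="ch1.jpg"/></div>'
unknown = '<div class="fullimage" id="nope"><img src="x.jpg"/></div>'

run_pass(TOC_RE, toc, TOC_FILE)                      # 1st pass: fills mydata
new_chapter = run_pass(DIV_RE, chapter, 'part0005.html')  # 2nd pass
untouched = run_pass(DIV_RE, unknown, 'part0006.html')

print(new_chapter)  # <h1>1 – The Beginning</h1>
```

A div whose id was never seen in the toc is returned unchanged, so a failed lookup cannot damage the file.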