Old Today, 06:43 AM   #5
lomkiri
Groupie
Posts: 174
Karma: 1497966
Join Date: Jul 2021
Device: N/A
Quote:
But I'm not sure how to get the data from the toc.
You can create a persistent dict (e.g. mydata) that survives from one pass of the regex-function to the next.
Your function will need two distinct parts: the first collects the titles from the toc, and the second makes the changes.

Then run the function twice, with a different regex on different files:
-- 1st pass (get the information from the toc):
Code:
Displayed file : your toc
find : <p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text"><strong class="calibre1">(\d+)[^>]+>\s?([^<]+)
scope : current file
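You'll get, for each occurrence:
match[1] -> file name
match[2] -> tag (the anchor after the # in the href)
match[3] -> chapter number
match[4] -> chapter title
Store this in your dict mydata. The key can be the file name or the tag, and the value is a dict holding the remaining fields, e.g. {'tag': ..., 'num': ..., 'title': ...}. Whichever key you choose, the second pass must look the entry up with the same kind of value.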
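
If you want to check the regex outside calibre first, here is a minimal sketch in plain Python. The sample toc line, the file name part0005.html and the tag ch1 are made up for illustration, and I keyed the dict by tag here (file name would work the same way):

```python
import re

# Same pattern as the "find" expression of the 1st pass
TOC_PATTERN = re.compile(
    r'<p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text">'
    r'<strong class="calibre1">(\d+)[^>]+>\s?([^<]+)'
)

# Hypothetical toc line, only to exercise the pattern
sample_line = (
    '<p class="toc1"><a href="part0005.html#ch1" class="toc_text">'
    '<strong class="calibre1">1</strong> The Beginning</a></p>'
)

mydata = {}
m = TOC_PATTERN.search(sample_line)
if m:
    # m[1] file name, m[2] tag, m[3] chapter number, m[4] chapter title
    mydata[m[2]] = {'file': m[1], 'num': m[3], 'title': m[4]}

print(mydata)
```

If your real toc markup differs (other classes, no strong tag), the pattern will need adapting before anything ends up in the dict.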

Then, on the second pass, you fill in your headers:
Code:
find : <div class="fullimage" id="([^"]+)".+/</div>
scope : all text files
The skeleton of your function would look something like this (not tested):
Code:
mydata = {}    # This dict survives between two passes of the same function

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):

    # first pass: get info and fill mydata, keyed by tag so the second pass can find it
    if file_name == 'toc.xhtml':    # placeholder, adapt this: name of the file holding the toc
        mydata[match[2]] = {'file': match[1], 'num': match[3], 'title': match[4]}
        # you can check the values with print(mydata)
        return match[0]

    # second pass: replace headers (here match[1] is the id of the div)
    if match[1] in mydata:
        chap = mydata[match[1]]
        header = chap['num'] + ' – ' + chap['title']
        return f'<h1>{header}</h1>'    # adapt this
    else:
        print(f'title not found for file {file_name}, tag {match[1]}')
        return match[0]
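
To dry-run the two passes outside calibre, something like the sketch below could help. It is only a simulation: run_pass is a hypothetical helper standing in for calibre calling the function on each match, the file names and sample markup are invented, and the second pattern uses .+?</div> as my assumption about what the header regex is meant to match.

```python
import re

TOC_FILE = 'toc.xhtml'   # hypothetical name of the file holding the toc
mydata = {}              # filled by the first pass, read by the second

def replace(match, number, file_name, *args, **kwargs):
    # first pass: collect the toc entries, keyed by tag
    if file_name == TOC_FILE:
        mydata[match[2]] = {'file': match[1], 'num': match[3], 'title': match[4]}
        return match[0]
    # second pass: build the header from the stored entry
    chap = mydata.get(match[1])
    if chap is None:
        print(f'title not found for file {file_name}, tag {match[1]}')
        return match[0]
    return f"<h1>{chap['num']} – {chap['title']}</h1>"

def run_pass(pattern, text, file_name):
    """Hypothetical stand-in for calibre: call replace() on every match."""
    count = 0
    def wrapper(m):
        nonlocal count
        count += 1
        return replace(m, count, file_name)
    return re.sub(pattern, wrapper, text)

TOC_PATTERN = (r'<p class="toc1"><a href="([^#]+)#([^"]+)" class="toc_text">'
               r'<strong class="calibre1">(\d+)[^>]+>\s?([^<]+)')
HEADER_PATTERN = r'<div class="fullimage" id="([^"]+)".+?</div>'  # assumed form

toc_text = ('<p class="toc1"><a href="part0005.html#ch1" class="toc_text">'
            '<strong class="calibre1">1</strong> The Beginning</a></p>')
chapter_text = '<div class="fullimage" id="ch1"><img src="img1.jpg"/></div><p>Body</p>'
unknown_text = '<div class="fullimage" id="zz9"><img src="img2.jpg"/></div>'

toc_after = run_pass(TOC_PATTERN, toc_text, TOC_FILE)                     # 1st pass
chapter_after = run_pass(HEADER_PATTERN, chapter_text, 'part0005.html')   # 2nd pass
unknown_after = run_pass(HEADER_PATTERN, unknown_text, 'part0006.html')

print(chapter_after)
```

The toc itself is left untouched by the first pass, and a div whose id is not in the dict is returned unchanged with a message, so a wrong regex in either pass shows up quickly.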

Last edited by lomkiri; Today at 06:57 AM. Reason: correct some syntax errors in the function