I wish I could understand what you mean, lol. The problem is that the chapter pages contain no text, only the HTML for a graphic:
<div class="fullimage" id="DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae"><img alt="" src="../images/00021.jpeg" class="calibre3"/></div>
The only place in the book that has the chapter titles as text is the TOC (page0003.html), with HTML like this:
Code:
<p class="toc1"><a href="part0013.html#CCNA1-f66f6b0d51c44ea49012bf2fe61db1ae" class="toc_text"><strong class="calibre1">9. </strong> The Chickens Draw First Blood</a></p>
<p class="toc1"><a href="part0014.html#DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae" class="toc_text"><strong class="calibre1">10. </strong> My Singing Makes Things Worse, and Everyone Is Totally Shocked</a></p>
I can search on the following to get the ID:
Code:
<div class="fullimage" id="([^"]+)"><img alt="" src="\.\./images/000\d+\.jpeg" class="calibre3"/></div>
But I'm not sure how to get the matching title from the TOC.
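One possible way to get the data from the TOC is to build a lookup table from each link's fragment ID (the part after the # in the href) to its number and title. The sketch below is plain Python and only assumes the TOC markup looks like the two entries quoted above. The names TOC_ENTRY, build_toc_map and toc_html are made up for illustration, and how the TOC file's contents get read into toc_html inside the editor is left open here.

```python
import re
from html import unescape

# Matches one TOC entry: the fragment after '#' in the href, the number
# inside <strong>, and the title text that follows it.
TOC_ENTRY = re.compile(
    r'<a href="[^"#]*#(?P<id>[^"]+)"[^>]*>\s*'
    r'<strong[^>]*>(?P<num>[^<]*)</strong>\s*'
    r'(?P<title>[^<]*)</a>'
)

def build_toc_map(toc_html):
    """Return {anchor_id: 'N. Title'} for every TOC entry found in toc_html."""
    titles = {}
    for m in TOC_ENTRY.finditer(toc_html):
        number = unescape(m.group('num')).strip()
        title = unescape(m.group('title')).strip()
        titles[m.group('id')] = f'{number} {title}'
    return titles

# Sample input copied from the TOC lines quoted in this post.
toc_html = '''
<p class="toc1"><a href="part0013.html#CCNA1-f66f6b0d51c44ea49012bf2fe61db1ae" class="toc_text"><strong class="calibre1">9. </strong> The Chickens Draw First Blood</a></p>
<p class="toc1"><a href="part0014.html#DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae" class="toc_text"><strong class="calibre1">10. </strong> My Singing Makes Things Worse, and Everyone Is Totally Shocked</a></p>
'''

toc_map = build_toc_map(toc_html)
print(toc_map['DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae'])
```

The ID captured from the fullimage div can then be used as the key into this table to look up the chapter number and title.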
I definitely want the function to return
Code:
return f'<h2 class="chapter-heading">{new_chapter_title}</h2>'
where new_chapter_title contains the chapter number and the chapter title.
UPDATE:
It seems like this line needs to be adjusted, since my Search doesn't use (<h[1|2|3])?
Code:
tag_name, anchor, text = match.group(1), replace_entities(match.group(2)), replace_entities(match.group(3))
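For what it's worth, a search with a single capture group only has match.group(1); asking for match.group(2) or match.group(3) raises IndexError. Below is a minimal sketch of how the function might look with just the ID captured. TOC_TITLES is a hypothetical hard-coded lookup table standing in for whatever gets built from the TOC, and SEARCH is the search expression from above with the dots escaped. The function signature is the one the editor's function mode uses.

```python
import re
from html import escape

# Hypothetical lookup table (assumption): ID -> "number. title".
# In practice this would be built from the TOC file rather than typed in.
TOC_TITLES = {
    'DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae':
        '10. My Singing Makes Things Worse, and Everyone Is Totally Shocked',
}

# The search expression, with one capture group for the ID.
SEARCH = re.compile(
    r'<div class="fullimage" id="([^"]+)"><img alt="" '
    r'src="\.\./images/000\d+\.jpeg" class="calibre3"/></div>'
)

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    anchor = match.group(1)  # the only group in this search
    new_chapter_title = TOC_TITLES.get(anchor)
    if new_chapter_title is None:
        # No TOC entry for this ID: leave the matched text unchanged.
        return match.group()
    # Escape &, < and > so the title is safe as element text.
    return f'<h2 class="chapter-heading">{escape(new_chapter_title, quote=False)}</h2>'

# Quick check outside the editor, using the div quoted in this post.
sample = '<div class="fullimage" id="DB7S1-f66f6b0d51c44ea49012bf2fe61db1ae"><img alt="" src="../images/00021.jpeg" class="calibre3"/></div>'
result = replace(SEARCH.search(sample), 1, 'part0014.html', None, None, None, None)
print(result)
```

Note that replacing the whole div this way drops the id the TOC link points to; adding id="{anchor}" to the h2 would keep that link target.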