MobileRead Forums - Add title="" to h* based on existing TOC -- suggestion for new feature (or plugin?)

Mister L · 06-21-2020, 01:43 PM

Quote:

Originally Posted by KevinH

A plugin might be best for this case.

Just so that everyone is on the same page ...

It would take an existing nav or ncx, follow the links back to the target file and element, and add a title attribute to it (remembering to HTML-escape any text) based on the current TOC. If an existing link points to the top of a file, it would inject a new h1 tag with nodisplay set on it with that title.

The idea is that after running this plugin, you should be able to regenerate the TOC from h tags in Sigil and get something very very close to the original TOC back.

Is that correct?

KevinH

Yes, that is exactly right. Would it be difficult to make a plugin for that?
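To make the idea concrete, here is a rough Python sketch of the two steps KevinH describes: adding a title attribute to the element a TOC link points at, and injecting a hidden h1 when the link points to the top of a file. It is only an illustration under several assumptions. It does not use Sigil's plugin interface at all; the `files` dict (file name to XHTML text), the function names, and the use of `style="display: none;"` for "nodisplay" are all hypothetical, and a real plugin would want a proper XHTML parser rather than regular expressions.

```python
import html
import re


def add_title_to_target(xhtml: str, fragment: str, toc_text: str) -> str:
    """Add title="..." to the first element whose id equals `fragment`."""
    escaped = html.escape(toc_text, quote=True)  # HTML-escape the TOC text
    # Match an opening tag that carries the id, e.g. <h2 class="x" id="ch1">
    pattern = re.compile(
        r'<(?P<tag>[A-Za-z][\w:-]*)'
        r'(?P<attrs>[^<>]*\sid="' + re.escape(fragment) + r'"[^<>]*?)'
        r'(?P<close>/?>)'
    )

    def repl(m: re.Match) -> str:
        attrs = m.group("attrs")
        if re.search(r'\stitle="', attrs):
            return m.group(0)  # leave an existing title attribute alone
        return f'<{m.group("tag")}{attrs.rstrip()} title="{escaped}"{m.group("close")}'

    return pattern.sub(repl, xhtml, count=1)


def inject_hidden_heading(xhtml: str, toc_text: str) -> str:
    """Insert a hidden h1 right after <body> (assumed meaning of 'nodisplay')."""
    escaped = html.escape(toc_text, quote=True)
    heading = f'<h1 style="display: none;" title="{escaped}">{escaped}</h1>'
    # A function is used as the replacement so backslashes in the title are kept literally
    return re.sub(r'(<body\b[^>]*>)', lambda m: m.group(1) + "\n" + heading,
                  xhtml, count=1)


def apply_toc_entry(files: dict, href: str, toc_text: str) -> None:
    """Apply one TOC entry (href plus link text) to the hypothetical `files` dict."""
    name, _, fragment = href.partition("#")
    if fragment:
        files[name] = add_title_to_target(files[name], fragment, toc_text)
    else:
        files[name] = inject_hidden_heading(files[name], toc_text)
```

Calling `apply_toc_entry(files, "ch1.xhtml#c1", "Some title")` once per nav or ncx entry would then leave the headings ready for Sigil to regenerate the TOC from them.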

Quote:

Originally Posted by Doitsu

The problem is that heading formats aren't predictable. You yourself gave two examples. In the first example, the heading consisted of two <h1> tags and in the second example it consisted of <h1> and <h2> tags.

BTW, both problems can be easily fixed with the right regular expressions. For example, you could use the following expressions to merge the two <h1> tags:

Find:<h1 epub:type="title" class="part_n"><span>(\d+)</span></h1>\s+<h1 epub:type="title" class="part_tit"><span>(.*?)</span></h1>
Replace:<h1 epub:type="title" class="part_n" title="\1: \2"><span>\1</span><br /><span class="part_tit">\2</span></h1>

If you process the first heading format with it and then generate the TOC, Sigil will add the following entry:

4: The Whale speaks of what she has learned about humans
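For anyone who wants to try Doitsu's expression outside Sigil, the same find and replace can be run with Python's `re` module, which also accepts `\1` and `\2` in the replacement string. This is just a sketch: the sample markup below is reconstructed from the pattern itself, not copied from the actual book, and the function name is made up for the example.

```python
import re

# Doitsu's Find expression, split across two raw strings for readability
FIND = (
    r'<h1 epub:type="title" class="part_n"><span>(\d+)</span></h1>\s+'
    r'<h1 epub:type="title" class="part_tit"><span>(.*?)</span></h1>'
)
# Doitsu's Replace expression: one h1 carrying title="number: text"
REPLACE = (
    r'<h1 epub:type="title" class="part_n" title="\1: \2">'
    r'<span>\1</span><br /><span class="part_tit">\2</span></h1>'
)


def merge_headings(xhtml: str) -> str:
    """Merge each number/title pair of h1 tags into a single h1."""
    return re.sub(FIND, REPLACE, xhtml)


# Assumed sample input, built to match the pattern above
sample = (
    '<h1 epub:type="title" class="part_n"><span>4</span></h1>\n'
    '<h1 epub:type="title" class="part_tit"><span>The Whale speaks of what '
    'she has learned about humans</span></h1>'
)
merged = merge_headings(sample)
print(merged)
```

The merged heading carries the title attribute that Sigil picks up when it generates the TOC, which is where the entry quoted above comes from.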

Yes, so far I have been relying on regex for these cases. (If you have a regex for the fake small-caps example I'd love to see it: I did that one last week and ended up just copying the titles over by hand.) But precisely because the heading formats are not predictable, I have to work out a new regex every time, depending on the specific characteristics of the file, rather than just running a saved search. That can be very time-consuming, especially as my regex skills are somewhat limited, and it's a bit frustrating to know that the exact information needed is already in the book but not easily exploited. I've had a whole series of these cases recently, with another very big one to do this week, which is why I started to think there must be a better way to do it.