MobileRead Forums - View Single Post - Add title="" to h* based on existing TOC -- suggestion for new feature (or plugin?)

Mister L · 07-07-2020, 11:18 PM

Quote:

Originally Posted by DNSB

My code—in theory—could pull from an epub2 toc.ncx, an epub3 nav.xhtml document or a html table of contents. The problem was the sheer number of special cases that had me wasting more and more time modifying the code as the complexity increased, time that I realized was taking longer than my manual process. I also ran into too many issues where trying to fix the code to work with one ebook broke it for a previously working ebook. Regressions 'Я Us.

Like most programming tasks, it is simple for the person who is not trying to implement it. For the person who is trying to implement it, you find yourself looking for a larger can so all the worms will fit back in.

"All the necessary elements are already in the file"? Bah, humbug. The issues are more that the structure of the epub is different. Even things like where the files are stored in the epub can be a PITA as in recent epub I edited where the text files were partly stored in the root of the archive and partly in a text folder.

Very interesting to know! From your previous post I had the impression you were not starting from the toc files, sorry for the confusion. If you don't mind my asking, did you try just pasting the toc title into an html comment, and then going back with a regex to move it to where it belonged (or even doing that by hand, which would still be easier if the title was already in the right page), rather than making a plugin to do *all* the steps including inserting a title attribute? Do you think, based on your experience, that if you only wanted to paste an html comment with the title, at the destination of the link, that would be possible to do without too much risk? I googled "how hard is it to learn Python" and all the answers said "super easy" (lol) but at this point I've put enough energy into thinking about this stupid thing (plus that 14-book collection I did recently was really a gigantic pain in the a**) that I'm really pretty tempted to try and learn Python if I can manage to find the time, so I can break a few files myself before I give up completely.
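To make the "just paste an html comment at the destination of the link" idea concrete, here is a rough Python sketch of what I imagine. It is only a guess at an approach, not tested on real books: the function name `comment_before_target` and the `toc:` marker inside the comment are made up for illustration, and it assumes the target id appears as a plain double-quoted id="..." attribute.

```python
import re


def comment_before_target(xhtml, fragment, label):
    """Insert an html comment holding the toc label just above the tag
    whose id matches the link fragment. Nothing else is modified."""
    # "--" is not allowed inside a comment, so break it up (assumption:
    # losing an exact double hyphen in the comment text is acceptable).
    safe = label.replace("--", "- -")
    # First opening tag that carries id="<fragment>".
    pattern = re.compile(r'<[A-Za-z][^<>]*\sid="%s"[^<>]*>' % re.escape(fragment))
    return pattern.sub(
        lambda m: "<!-- toc: %s -->\n%s" % (safe, m.group(0)),
        xhtml,
        count=1,
    )
```

If the id is not found the text comes back unchanged, so the worst case should be a missing comment rather than a broken file.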

Quote:

Originally Posted by slowsmile

@Mister L...I have some good reasons why I am so frustrated with your plugin requirements. These reasons are:

1. In your post above that contains your bullet-point spec you said that you wanted to transform or combine the headings of h1 and h2 in your xhtml files using the NCX TOC headings. So after 3 whole pages of telling me what you want in your plugin, you then -- on the 4th page -- mention for the first time ever in that spec that you want to combine h1 and h2 for certain headings only. That p*issed me off because that single requirement means that I will have to redesign the plugin more or less from scratch again.

This is what I meant when I said we were talking at cross purposes. It is very unfortunate that you got the wrong idea, and I am sorry you were frustrated, but I never said this. In post 47, which I think is what you mean by the bullet-point spec, I attached a sample test file and explained what the results should be based on the various cases in that file. I specifically said:

Quote:

The chapters have a title split into 2 parts: h1 with the chapter number + h2 with the title, which has fake smallcaps. Chapter 6 has fake smallcaps with a bonus capitalised word for maximum span kludge fun. These are referenced in the toc as "1. Title of the chapter" (in sentence case; Chapter 6 is "Title of the chapter with Propernoun"). The toc reference should be copied exactly as it is (number followed by a period then the title, with zero modifications to the text or the case) to a title="" attribute in the h1 OR an html comment immediately above the tag with the toc ID, whichever is easiest to code.

Add the toc version to a title="" attribute (or html comment), NOT "modify the h1 and h2 headings to match the toc". I also offered to make a second file which would show the results I was hoping for after running the plugin.

I tried very hard to explain precisely what I meant, which was to copy the text FROM THE NCX file into title="" attributes in the html file, WITHOUT modifying the html headings. I gave examples of why I might want to do this, which included cases where the titles as displayed in the NCX files are *already* combined from h1 and h2 tags. I gave these explanations multiple times in this thread starting in the first post, but specifically in post 28 I gave examples of code showing the html, the toc code, and the desired result:

Quote:

After running the plugin the result I expected was:

Code:

<h1 id="toc_marker-6" title="1. Le lion sur la colline">1</h1>

<h2><span class="Cap">L</span><span class="SmallCap">E LION SUR LA COLLINE</span></h2>

The code highlighted in blue in that post (the title="1. Le lion sur la colline" attribute on the h1) is the code I wanted to add, ie the exact title as displayed in the toc, without modifying in any way the original headings in the html file (except to add the title="" attribute).
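For anyone who prefers to see the request as code, the following is a minimal Python sketch of the step described above: read the labels from the NCX and copy each one, unchanged, into a title="" attribute on the heading that carries the toc ID. It is an illustration under assumptions, not the plugin discussed in this thread. The names `toc_labels` and `add_titles` are hypothetical, the file name is assumed to be written exactly as it appears in the NCX src value, and ids are assumed to be double-quoted.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX_NS = {"ncx": NCX_URI}


def toc_labels(ncx_text):
    """Map each NCX target ('file#fragment') to its label text."""
    root = ET.fromstring(ncx_text)
    labels = {}
    for point in root.iter("{%s}navPoint" % NCX_URI):
        text = point.find("ncx:navLabel/ncx:text", NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        if text is None or content is None:
            continue
        labels[content.get("src")] = (text.text or "").strip()
    return labels


def add_titles(xhtml, filename, labels):
    """Add title="<toc label>" to the h1-h6 tag that has the toc id.
    Heading text is left alone; existing title attributes are kept."""
    for src, label in labels.items():
        path, _, fragment = src.partition("#")
        if path != filename or not fragment:
            continue
        pattern = re.compile(
            r'<(h[1-6])\b([^>]*\sid="%s"[^>]*)>' % re.escape(fragment)
        )

        def add_attr(m, label=label):
            if re.search(r"\stitle=", m.group(2)):
                return m.group(0)  # already has a title, do nothing
            return '<%s%s title="%s">' % (
                m.group(1), m.group(2), escape(label, quote=True)
            )

        xhtml = pattern.sub(add_attr, xhtml, count=1)
    return xhtml
```

Using a regex on the opening tag only, rather than reparsing and rewriting the whole page, is a deliberate choice here so the rest of the markup (fake smallcaps spans included) stays byte-for-byte as it was. It does nothing about the special cases DNSB mentions, such as text files split between the archive root and a text folder.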

Quote:

Originally Posted by slowsmile

2. Your requirement that your xhtml headings must look exactly the same as NCX headings really surprises me. Why do you insist on that? In my ebooks my xhtml headings will look like this:

CHAPTER 1

The Tiger Steps Out

...but my NCX TOC headings will look like this:

Chapter 1 ~ The Tiger Steps Out

I don't change my xhtml headings.

Your apparent insistence on the xhtml headings being combined and exactly the same as the NCX headings makes more [unnecessary] work for me wrt the plugin. I also can't see any good reasons for that requirement since those two different heading formats that I use (shown above) shouldn't cause any confusion whatsoever for any reader as far as I can see.

I did not say this either. I specifically said, in these files, the titles in the TOC are not the same as the titles displayed in the html pages (exactly like in your example here, among other examples), that is why I need to copy the TOC titles into a title="" attribute (or html comment) so that when I regenerate the TOC after modifying the files, the TOC version of the titles is not lost. I did not want to modify the headings in the html pages.
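To show why the title="" attribute matters when the TOC is regenerated, here is a small Python sketch of the lookup I have in mind: for each heading, take the title attribute if there is one, otherwise fall back to the visible heading text. This is only an assumption about how a TOC generator could behave, not a description of any particular tool, and `toc_entries` is a made-up name.

```python
import re
from html import unescape

HEADING = re.compile(r"<(h[1-6])\b([^>]*)>(.*?)</\1>", re.S)
TITLE_ATTR = re.compile(r'\stitle="([^"]*)"')


def toc_entries(xhtml):
    """Return (level, label) pairs for every heading in the page.
    A title="" attribute wins over the displayed heading text."""
    entries = []
    for m in HEADING.finditer(xhtml):
        level = int(m.group(1)[1])
        attr = TITLE_ATTR.search(m.group(2))
        if attr:
            label = unescape(attr.group(1))
        else:
            # Strip inline tags (e.g. fake smallcaps spans), tidy whitespace.
            label = unescape(re.sub(r"<[^>]+>", "", m.group(3)))
            label = " ".join(label.split())
        entries.append((level, label))
    return entries
```

With the example from post 28, the h1 would give "1. Le lion sur la colline" from its title attribute, while the h2 without a title would give its displayed text "LE LION SUR LA COLLINE". The headings in the page stay untouched and the TOC wording survives.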

I am very sorry if this was not clear to you but I tried my best to explain it, multiple times, starting in the first post of the thread, including directly in response to your posts, with examples of the results I was hoping for. I don't think I could explain more clearly than with the example code I included in post 28. I wish, if you did not understand, you had asked me to clarify certain points before continuing, rather than just going ahead. Frankly this was very frustrating for me too.