Old 07-04-2020, 01:24 AM   #33
Mister L
Groupie
Mister L is the 'tall, dark, handsome stranger' all the fortune-tellers are referring to.
Posts: 179
Karma: 91148
Join Date: Jun 2010
Device: Sony 350
Quote:
Originally Posted by Hitch View Post
I do wish to jump in here--if by "professionals" you mean formatters, formatters are absolutely not responsible for title casing. We're not editors or proofreaders and we are not paid to do that work. In fact, if we "forget our place" and do make corrections, we're typically told off for it. I once had to listen to an ass-chewing by a rather jumped-up self-published author, who informed me that if she wanted her book ruined by "a bunch of self-important clerks," she'd hire some.

(Tex here does do that, for one of his clients in particular, but that's a unique situation.)

So, if you happened to have meant formatters, please know that formatters do NOT make those choices. Believe me, at least once a week I get a manuscript with "forward" in it and punctuation outside of quotation marks where it oughtn't be, incorrect emdash use, and on and on and on, but formatters don't earn remotely enough money to also proofread and correct what we see. And trade publishers? They hire out Indian firms, so...fuhgeddaboudit.

Hitch
I'm not talking about title casing at all, although this thread seems determined to be about title casing, since people keep getting the impression that that is what I'm talking about.

The examples I gave in the first post of the thread specifically concern the html code, and even more specifically the difference between how chapter titles are marked up in the html files and how they should be shown in the toc: for example, using h1 for the chapter number and then h2 for the title, instead of a single h1 with a br to send the title to the next line, or using multiple spans to make fake smallcaps, or something like that. I am a formatter too; I know the quality of the text is not my responsibility.

Quote:
Originally Posted by slowsmile View Post
I think in my original post to you I mentioned that you will not get good results with this plugin if your epub is using fake titlecase or fake smallcaps in your headings. In the code above -- you're using fake titlecase. In the above html you're using span classes to capitalize the first letter and then using another span class to make all text after lower case.

However, the good news is that I had another quick look at the above problem and I think I've found a way to resolve it. Try running the new plugin below.
I must have misunderstood when you said the plugin "Gathers all the toc item heading strings from the epub TOC page and puts them into a list." What did you mean by that exactly? What does it do with the list of titles from the toc page? The titles from the toc page are exactly the text I am trying to get, NOT the titles which are in the html pages.

The use of fake smallcaps or other weird code should not matter; the whole point of the plugin is to be used precisely in cases like that, when it is excessively complicated to extract the title from the h* tags using the usual methods like regex, but there is already a correct toc page in the file. Otherwise it's easy enough to just do a regex for this.

The plugin shouldn't automate the regex; it should not take any text from the html file, no matter what kind of tag it is in, precisely because that text is not presented the way it should be in the toc, whereas the text in the toc page IS already correct. The plugin should copy the titles from the toc page, exactly as they are, and simply PASTE them back to each html file, without modifying the case, or the text itself, in any way. Is it possible to modify this plugin to do that?
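
To make the request concrete, here is a rough sketch of the behaviour described above, not the plugin's actual code. It assumes the toc page is xhtml whose entries are plain `<a href="...">` links, and that the title should go into a title attribute on the first h1 of each chapter file; both are guesses. The function names, the file name chap01.xhtml, and the toc wording in the example are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser
import re


class TocLinkCollector(HTMLParser):
    """Collect (href, visible text) pairs from every <a> in a toc page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.entries = []   # (href, text) pairs in document order
        self._href = None   # href of the <a> currently open, if any
        self._chunks = []   # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Only runs of whitespace are collapsed; case and wording are kept.
            text = " ".join("".join(self._chunks).split())
            self.entries.append((self._href, text))
            self._href = None


def toc_titles(toc_xhtml):
    """Map each linked file name (path and fragment stripped) to its toc text."""
    collector = TocLinkCollector()
    collector.feed(toc_xhtml)
    collector.close()
    titles = {}
    for href, text in collector.entries:
        filename = href.split("#")[0].split("/")[-1]
        titles.setdefault(filename, text)  # keep the first toc entry per file
    return titles


def paste_title(chapter_xhtml, title):
    """Add title="..." to the first <h1> tag; nothing is read from the headings.

    Assumption: the first <h1> has no title attribute yet.
    """
    attr = ' title="%s"' % escape(title, quote=True)
    # A function replacement keeps the toc text from being read as a regex template.
    return re.sub(r"<h1\b", lambda m: m.group(0) + attr, chapter_xhtml, count=1)


# Hypothetical toc entry and chapter fragment, for illustration only.
toc = ('<nav><ol><li><a href="../Text/chap01.xhtml#toc_marker-6">'
       '1. Le Lion sur la colline</a></li></ol></nav>')
chapter = ('<h1 id="toc_marker-6">1</h1>\n'
           '<h2><span class="Cap">L</span>'
           '<span class="SmallCap">E LION SUR LA COLLINE</span></h2>')

titles = toc_titles(toc)
result = paste_title(chapter, titles["chap01.xhtml"])
print(result)
```

The point of the sketch is that the text in the h1/h2 tags is never read: the string comes only from the toc page, so fake smallcaps or split headings in the chapter file cannot affect it.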

If it's not possible, no worries. If it is possible but you don't want to spend more time on it, let me know, and I will try to figure it out myself.

I tried the new version of the plugin. Unfortunately, it is doing something very strange now, and I am not sure exactly what. Here is the result after running it:

Code:
<div title="Le Lion Sur La Collinenonenone*Unprends Les Ronces À Pleines Mains, Et Tu Te Piqueras…Far Dareis Maicar’A’Carncar’A’Carnfais Flèche De Tout Bois, Ou Laisse Les Ténèbres S’Abattre Sur Le Monde…Leitmotivsaidinsaidinsaidinsaidinidemla Seule Façon De Vivre C’Est De Mourir…Je Dois Mourir. La Mort, Voilà Tout Ce Que Je Mérite…Car’A’Carnsaidinsaidinsammael, Rahvin, Moghedien Et…Non, S’Ils Étaient Tous Des Suppôts Des Ténèbres, Tu Les Utiliserais Quand Même.Shoufacar’A’Carncar’A’Carnet Où Qu’Elles Soient… Des Aes Sedai… Au Service De Tous… Mais Le Hall Des Serviteurs Est Détruit, Désormais… Détruit Pour Toujours… Ilyena, Mon*Amour…Car’A’Carnnonenone">
    <h1 id="toc_marker-6">1</h1>
    <h2><span class="Cap">L</span><span class="SmallCap">E LION SUR LA COLLINE</span></h2>
    <p class="Center"><span class="SmallCap"><span><img alt="06jordan-1.jpg" src="../Images/06jordan-1.jpg" width="20%"/></span></span></p>
    <p class="Center"><span class="SmallCap">*</span></p>
    <p>La Roue du Temps tourne et les Âges naissent et meurent, laissant dans leur sillage des souvenirs destinés à devenir des légendes. Puis les légendes se métamorphosent en mythes qui sombrent eux-mêmes dans l’oubli longtemps avant la renaissance de l’Âge qui leur donna le jour.</p>
The text in the title attribute is not the text from the toc file, but it doesn't seem to be the text from the html file either (I left the first paragraph of the chapter so you can see), so I am not sure where it came from.