Quote:
Originally Posted by Mister L
[...] or have a different title in the TOC to the one displayed in the page, such as the portrait of the author, the copyright page, the cover, the title page which is called "Title page" in the TOC but obviously not in the page, the "By the same author" / bibliography page, Publisher catalogue page...
The real question is: Does this belong in a TOC at all?
I would strongly lean towards No.
Quote:
Originally Posted by Mister L
No choice for those cases but to do it all by hand.
Sometimes that's what you have to do, especially if you get some hideous code that's inconsistent spaghetti gobbledygook like you brought up in this thread.
I'm going to pull a JSWolf and say clean the code up and make it consistent first, then your life will be much easier with the Regex going forward.
* * *
On your Title Casing problem: there are a few solutions, but I've found almost all of them to be suboptimal, each with its own issues on edge cases.
Back in 2014, I used this Regex:
https://www.mobileread.com/forums/sh...53#post2930153
https://www.mobileread.com/forums/sh...d.php?t=233018
(I still use similar nowadays.)
Calibre introduced a "Function Mode" and even has an entire section of the manual dedicated to it, including
"Automatically fixing the case of headings in the document".
But most of the solutions I've come across over the years don't take into account the nuances needed for proper Title Casing (different Style Guides require different rules).
This is the site I use:
https://capitalizemytitle.com/
It handles title casing better than many of the other tools I've run across over the years... and it does handle edge cases like caps after : or EM DASH.
But you'll always get exceptions that need fixing by hand, like: DNA, RNA, mRNA, First/Last names (DeSanto, McDonald), etc.
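To make the edge cases concrete, here is a rough Python sketch of the kind of logic involved. It is not what capitalizemytitle.com or Calibre actually does; the function name `title_case` and both word lists (`SMALL_WORDS`, `PROTECTED`) are my own assumptions, and you would swap them for whatever your Style Guide requires. It capitalizes the first and last word, forces a capital after a colon or EM DASH, and leaves a hand-maintained list of exceptions alone:

```python
import re

EM_DASH = "\u2014"

# Hypothetical lists: adjust both to match your Style Guide.
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for", "in",
               "nor", "of", "on", "or", "the", "to", "via", "vs"}
# Words whose exact casing must survive, keyed by their lowercase form.
PROTECTED = {w.lower(): w for w in ("DNA", "RNA", "mRNA", "DeSanto", "McDonald")}

# Split on runs of whitespace or a single EM DASH, keeping the separators.
SPLIT_RE = re.compile(r"(\s+|" + EM_DASH + ")")


def title_case(text):
    tokens = [t for t in SPLIT_RE.split(text.strip()) if t]
    word_idx = [i for i, t in enumerate(tokens)
                if t != EM_DASH and not t.isspace()]
    last = word_idx[-1] if word_idx else -1

    out = []
    force_cap = True  # the first word is always capitalized
    for i, tok in enumerate(tokens):
        if tok.isspace():
            out.append(tok)
            continue
        if tok == EM_DASH:
            out.append(tok)
            force_cap = True  # capitalize whatever follows the dash
            continue
        # Peel leading/trailing punctuation off the word itself.
        lead, core, trail = re.fullmatch(r"(\W*)(.*?)(\W*)", tok).groups()
        key = core.lower()
        if key in PROTECTED:
            core = PROTECTED[key]
        elif key in SMALL_WORDS and not force_cap and i != last:
            core = key
        else:
            core = core[:1].upper() + core[1:].lower()
        out.append(lead + core + trail)
        force_cap = tok.endswith(":")  # capitalize after a colon
    return "".join(out)


print(title_case("the role of mRNA and dna: a review by desanto and mcdonald"))
```

Even this toy version still trips on things like possessives ("mcdonald's" is not in the exception list, so it comes out as "Mcdonald's"), which is exactly why the exception list never really stops growing.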