Old 07-11-2020, 10:47 AM   #1
playful
Mammal
Posts: 126
Karma: 21380
Join Date: Oct 2010
Location: Right Here
Device: Onyx Note Pro, Kindle DXG
[SOLVED] Possible to use <title> tags as TOC headings?

Hi there,

A bit of a Sigil-noobie question.

Working with 1000+ html files.
Each has a proper title tag such as <title>1 - it all began here</title> but no h1, h2 etc. tags.

Out of the box, when generating the TOC, is there a way to make Sigil take the <title> tags into account?

As a workaround, I've added some <h1> tags to the 1000+ pages with a simple regex-replace, but that's not really my first choice.

Thanks!

p.s.
For reference if someone on the same track needs that workaround:
Search: (?s)(<title>([^<]+).*?<body>)
Replace: \1\n<h1>\2</h1>\n
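For anyone who would rather run that workaround outside Sigil, here is a minimal Python sketch of the same search/replace. The function name and the sample page are made up for illustration. One thing it makes visible: the pattern only matches a bare `<body>` tag, so a page whose body tag carries attributes is left untouched.

```python
import re

# The search/replace pair from the post, written for Python's re module.
PATTERN = re.compile(r"(?s)(<title>([^<]+).*?<body>)")
REPLACEMENT = r"\1\n<h1>\2</h1>\n"


def add_h1_from_title(markup):
    """Insert an <h1> holding the <title> text right after a bare <body> tag."""
    return PATTERN.sub(REPLACEMENT, markup, count=1)


# Hypothetical sample page, not taken from the real site.
sample = (
    "<html><head><title>1 - it all began here</title></head>\n"
    "<body><p>First paragraph.</p></body></html>"
)
result = add_h1_from_title(sample)
print(result)

# A body tag with attributes does not match, so nothing is inserted.
untouched = add_h1_from_title('<title>x</title><body class="a">')
```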

Last edited by playful; 07-11-2020 at 11:50 AM. Reason: Solved
Old 07-11-2020, 10:49 AM   #2
JSWolf
Resident Curmudgeon
Posts: 80,685
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
The title tags would normally have the book's title and not the chapter title.
Old 07-11-2020, 11:03 AM   #3
playful
Quote:
Originally Posted by JSWolf View Post
The title tags would normally have the book's title and not the chapter title.
Alright, let me explain. I'm converting a website to an ebook. The website has 1000+ pages. It's standard for the html pages of websites to each have their own <title> tag.

Looking at toc.ncx, I can easily generate that in Python.
After generating the <h1> tags as mentioned above, now that I have the TOC, I can easily strip the <h1> tags again.
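A rough sketch of what generating toc.ncx straight from the <title> tags could look like in Python. This is not the script actually used here: the function names, the sample pages, the `Text/...` hrefs (which assume the NCX sits one folder above the text files) and the book id are all assumptions for illustration.

```python
import html
import re
from xml.sax.saxutils import escape, quoteattr

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(markup, fallback):
    """Return the page's <title> text, or the fallback if there is none."""
    match = TITLE_RE.search(markup)
    if not match:
        return fallback
    # Decode entities such as &amp; so they get escaped exactly once later.
    return html.unescape(match.group(1)).strip() or fallback


def build_ncx(pages, book_title, uid):
    """Build a flat NCX from (href, markup) pairs given in reading order."""
    nav_points = []
    for order, (href, markup) in enumerate(pages, start=1):
        label = escape(extract_title(markup, fallback=href))
        src = quoteattr(href)  # returns the value already wrapped in quotes
        nav_points.append(
            f'    <navPoint id="navPoint-{order}" playOrder="{order}">\n'
            f"      <navLabel><text>{label}</text></navLabel>\n"
            f"      <content src={src}/>\n"
            f"    </navPoint>"
        )
    uid_attr = quoteattr(uid)
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">',
        "  <head>",
        f'    <meta name="dtb:uid" content={uid_attr}/>',
        '    <meta name="dtb:depth" content="1"/>',
        "  </head>",
        f"  <docTitle><text>{escape(book_title)}</text></docTitle>",
        "  <navMap>",
        *nav_points,
        "  </navMap>",
        "</ncx>",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical pages; in practice these would be read from the 1000+ files.
pages = [
    ("Text/page0001.xhtml",
     "<html><head><title>1 - it all began here</title></head><body></body></html>"),
    ("Text/page0002.xhtml",
     "<html><head><title>2 - Tom &amp; Jerry</title></head><body></body></html>"),
]
ncx = build_ncx(pages, book_title="Website as ebook", uid="hypothetical-book-id")
print(ncx)
```

The dtb:uid value should match the unique identifier declared in the book's OPF file, so the placeholder id above would need replacing.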

Just wondering if there is a way to do it out of the box, for future reference.

Thanks!
Old 07-11-2020, 11:32 AM   #4
Turtle91
A Hairy Wizard
Posts: 3,394
Karma: 20212733
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
No, I don't think Sigil has a built-in function for that. However, you can save that regex and easily have access to it in the future (under Tools/Saved Searches).

As for the regex in your example, it is more complex than my reg-fu can easily decipher... I have some homework to do!
Old 07-11-2020, 11:43 AM   #5
DiapDealer
Grand Sorcerer
Posts: 28,868
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
No out of the box functionality like that, no.

If you know python, you could probably do a quick plugin to build an ncx from title tag values. Or you could have it insert nodisplay h tags into the top of all html files with the title tag values added as content (or title attributes) with your above regexp. The latter would allow you to use Sigil's built-in ncx generator.
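A minimal sketch of the second idea, assuming plain Python run over the files rather than a real Sigil plugin (no plugin API is used here). The `nodisplay` class name and the function name are hypothetical, and the stylesheet would still need a rule such as `.nodisplay { display: none; }` for the heading to be hidden. Unlike the regex above, this version also accepts a body tag that has attributes.

```python
import html
import re

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
BODY_OPEN_RE = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def insert_hidden_heading(markup, css_class="nodisplay"):
    """Add a hidden <h1> with the <title> text right after the <body> tag."""
    title_match = TITLE_RE.search(markup)
    body_match = BODY_OPEN_RE.search(markup)
    if not title_match or not body_match:
        return markup  # nothing to work with, leave the page alone
    # Decode first, then re-escape, so entities are not escaped twice.
    text = html.unescape(title_match.group(1)).strip()
    heading = f'\n<h1 class="{css_class}">{html.escape(text, quote=False)}</h1>'
    end = body_match.end()
    return markup[:end] + heading + markup[end:]


# Hypothetical sample page with an attribute on the body tag.
page = (
    "<html><head><title>1 - it all began here</title></head>\n"
    '<body class="page"><p>First paragraph.</p></body></html>'
)
patched = insert_hidden_heading(page)
print(patched)
```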
Old 07-11-2020, 11:50 AM   #6
playful
@Turtle91, @DiapDealer,

Thank you for the great ideas! (saving the search, the no-display h tags…)

Solved!
Old 07-12-2020, 07:25 PM   #7
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by playful View Post
A bit of a Sigil-noobie question.

Working with 1000+ html files.
Each has a proper title tag such as <title>1 - it all began here</title> but no h1, h2 etc. tags.

Out of the box, when generating the TOC, is there a way to make Sigil take the <title> tags into account?
You could do this with 2 Find/Replaces. One to insert the <title> as <h1>, and one to remove the <h1>.

Note: Make sure Sigil's Find & Replace is set to Regex mode, and check the box for Dot All.

* * *

Step 1: Convert <title> into an <h1>:

Find: <title>(.+?)</title>(.+)<body>
Replace: <title>\1</title>\2<body><h1>\1</h1>

Step 2: Press Tools > Table of Contents > Generate Table of Contents, then create your TOC.

Step 3: Do the opposite. Remove the <h1> we just created:

Find: <body><h1>(.+?)</h1>
Replace: <body>

If you do this often, you can even create a Saved Search for both of those fixes.
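To sanity-check the round trip outside Sigil, here is a small Python sketch of Steps 1 and 3. `re.DOTALL` stands in for the Dot All checkbox, and the sample page and variable names are made up. It assumes a bare `<body>` tag with at least one character between `</title>` and `<body>`, which is what the Find patterns require.

```python
import re

# Step 1: copy the <title> text into an <h1> right after <body>.
STEP1_FIND = re.compile(r"<title>(.+?)</title>(.+)<body>", re.DOTALL)
STEP1_REPLACE = r"<title>\1</title>\2<body><h1>\1</h1>"

# Step 3: remove the <h1> that Step 1 inserted.
STEP3_FIND = re.compile(r"<body><h1>(.+?)</h1>", re.DOTALL)
STEP3_REPLACE = "<body>"

# Hypothetical sample page.
original = (
    "<html><head><title>1 - it all began here</title></head>\n"
    "<body><p>First paragraph.</p></body></html>"
)

with_heading = STEP1_FIND.sub(STEP1_REPLACE, original, count=1)
# Step 2 (generating the TOC) would happen in Sigil at this point.
restored = STEP3_FIND.sub(STEP3_REPLACE, with_heading, count=1)

print(with_heading)
print(restored == original)
```

If the page already starts its body with its own `<h1>`, Step 3 still removes only the inserted one, because the lazy `(.+?)` stops at the first `</h1>` after `<body>`.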

Quote:
Originally Posted by JSWolf View Post
The title tags would normally have the book's title and not the chapter title.
No. For a more extensive discussion of best practices for <title>, see the 2018 thread "Two questions", especially my Post #2.

Last edited by Tex2002ans; 07-12-2020 at 07:31 PM.
Old 07-12-2020, 09:13 PM   #8
playful
@Tex2002ans

Hey there, it looks to me like you missed this entire part of my original post:

Quote:
Originally Posted by playful View Post
Hi there,
As a workaround, I've added some <h1> tags to the 1000+ pages with a simple regex-replace, but that's not really my first choice.

Thanks!

p.s.
For reference if someone on the same track needs that workaround:
Search: (?s)(<title>([^<]+).*?<body>)
Replace: \1\n<h1>\2</h1>\n
But thanks for your kind message anyway!
All times are GMT -4.