11-30-2017, 02:01 AM | #1 |
just an egg
Posts: 1,586
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
|
convert <p> to <h1>
Too often an epub uses <p> for chapter titles.
Any suggestions on how to convert all those <p> to <h1> other than a manual edit? (I tried Tag Mechanic but couldn't figure out how to do it that way.) Or is there a way for those <p> to be recognized when generating a TOC?
Finally (a bit off-topic for Sigil, sorry), any suggestions from Mac users for a good HTML editor for Mac? I currently use and love Taco, but the developer closed shop years ago, and with every OS update I fear it will stop working. In fact, I'm still on El Capitan because I'm afraid I'll lose Taco if I update to Sierra or High Sierra. Thanks! |
11-30-2017, 02:12 AM | #2 |
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
If your headings have only bare <p> tags, you're out of luck, I guess. But usually there are some class or style attributes that are unique to headings in a particular book, so you can use those to build a regex that finds your headings.
Last edited by doubleshuffle; 11-30-2017 at 03:16 AM. |
11-30-2017, 05:42 AM | #3 |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
That's what I'd do, too. If actually changing the tags with regex proves tricky, non-header chapter titles are usually near the top of a page and often have a consistent pattern that can be used to match them (immediately follows the body tag, nested in a div following the body, etc). Find the pattern and give all of those chapter-title p tags (or most of them) a specific class and then safely change them to header tags with Tag Mechanic.
|
11-30-2017, 10:28 AM | #4 |
A Hairy Wizard
Posts: 3,095
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
^^^ What they said.
Here's an example:
Code:
search: <body>\s*<p>Chapter (.*?)</p>
replace: <body>\n<h1>Chapter \1</h1>\n
Code:
<h1>Title Page</h1>
<h2>maps</h2>
<h2>epigraph</h2>
<h2>Part 1</h2>
<h3>Chapter 1</h3>
<h3>Chapter 2</h3>
<h3>Chapter 3</h3>
<h2>Part 2</h2>
<h3>Chapter 4</h3>
<h3>Chapter 5</h3>
<h3>Chapter 6</h3>
Code:
Title Page
  maps
  epigraph
  Part 1
    Chapter 1
    Chapter 2
    Chapter 3
  Part 2
    Chapter 4
    Chapter 5
    Chapter 6
Last edited by Turtle91; 11-30-2017 at 10:30 AM. |
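For anyone who wants to preview how heading levels like the ones above would nest in a generated TOC, here is a minimal Python sketch. It is not how Sigil builds its TOC; `HEADING_RE` and `outline` are names made up for illustration, and the regex only copes with simple, well-formed heading tags.

```python
import re

# Illustrative pattern (an assumption, not Sigil's logic): match <h1>..<h6>
# and capture the level digit and the inner content.
HEADING_RE = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)

def outline(xhtml):
    """Return an indented outline, one line per heading, two spaces per level."""
    lines = []
    for level, inner in HEADING_RE.findall(xhtml):
        text = re.sub(r"<[^>]+>", "", inner).strip()  # drop inline markup
        lines.append("  " * (int(level) - 1) + text)
    return "\n".join(lines)

sample = (
    "<h1>Title Page</h1>"
    "<h2>Part 1</h2><h3>Chapter 1</h3><h3>Chapter 2</h3>"
    "<h2>Part 2</h2><h3>Chapter 4</h3>"
)
print(outline(sample))
```

Each heading level indents one step further, which is why consistent h1/h2/h3 usage gives a properly nested TOC.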
11-30-2017, 12:43 PM | #5 | |
just an egg
Posts: 1,586
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
|
Quote:
I'm trying to learn regex, but it's been slow going, as regex tends to make my brain curl up into a fetal ball and start whimpering.
Finding the chapter headings hasn't been a problem. I search:
<p class="chapterHead">
replace:
<h1 class="chapterHead">
But then I would go in and manually change each closing </p> tag to an </h1>. Urgh.
I've tried
search: <p class="chapterHead">.*?</p>
replace: <h1 class="chapterHead">.*?</h1>
but that had unhappy results.
Thanks, Turtle91, for the proper regex to automate the entire process! |
|
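A note on why the last attempt in the post above misbehaves: a replacement field is not a pattern, so `.*?` there is inserted literally instead of re-using the matched title. Wrapping the title in a capture group and referring to it as `\1` (as in Turtle91's example) avoids that. The sketch below shows the idea in Python's `re` module rather than in Sigil's Find & Replace; `PATTERN` and `promote` are illustrative names, and the `chapterHead` class is the one from the post.

```python
import re

# Group 1 captures whatever sits between the opening and closing tags.
PATTERN = re.compile(r'<p class="chapterHead">(.*?)</p>', re.DOTALL)

def promote(xhtml):
    """Turn chapterHead paragraphs into h1 headings, keeping their text."""
    return PATTERN.sub(r'<h1 class="chapterHead">\1</h1>', xhtml)

before = '<p class="chapterHead">Chapter 1</p>\n<p>Body text.</p>'

# The failed attempt: ".*?" in the replacement is just literal characters.
broken = re.sub(r'<p class="chapterHead">.*?</p>',
                '<h1 class="chapterHead">.*?</h1>', before)

fixed = promote(before)
print(broken)
print(fixed)
```

In Sigil's Regex mode the same pair would presumably be search `<p class="chapterHead">(.*?)</p>` and replace `<h1 class="chapterHead">\1</h1>`; ordinary `<p>` paragraphs are left alone because they lack the class.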
11-30-2017, 01:58 PM | #6 | |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
By default, the only thing the plugin will let you change a p tag to is a div tag. But you can change that in the plugin's customization config. Just right-click anywhere on the plugin's GUI dialog (the above image) and select "Customize Plugin" from the menu. Then add h1, h2, h3, etc. (comma separated) to the list of tags available for p tag manipulation and click "Apply & Close."
NOTE: make sure you have all the relevant xhtml files highlighted in Sigil's Book Browser before launching the plugin. The plugin only processes/searches/affects those files which are selected.
Last edited by DiapDealer; 11-30-2017 at 02:33 PM. |
|
11-30-2017, 02:36 PM | #7 | |
just an egg
Posts: 1,586
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
|
Quote:
I loved Tag Mechanic before, but now I love it even more! (And once my brain uncurls out of fetal position, I will continue my attempts to learn more regex.) |
|
11-30-2017, 02:58 PM | #8 |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Happy to help.
|
11-30-2017, 07:21 PM | #9 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Obviously this only works with epub2 books that have a valid (=working) NCX TOC and it works best with NCX TOC entries that reference fragment ids (e.g. file.xhtml#id2).
BTW, it does the following:
1. It parses the NCX and generates a list of TOC entries.
2. If the target href has a fragment id, it'll look for the target tag based on the fragment id and change the tag (or its parent tag) to a heading tag. (=Best-case scenario.)
3. If the target href doesn't have a fragment id, it'll look for the first tag with the same text as the TOC entry.
4. If the previous step failed, it'll insert a dummy heading tag with the TOC entry as the title attribute. This is the worst-case scenario, but at least it'll allow you to generate a TOC with Sigil.
Since I only tested this plugin with a couple of old valid Calibre-generated books with working NCX TOCs but no heading tags, I don't feel comfortable releasing it, but if you want to test this beta version, PM me for a Dropbox link. |
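To make step 1 and the fragment-id check in steps 2 and 3 more concrete, here is a rough Python sketch using only the standard library. It is not the plugin's code, which isn't shown in this thread; `toc_entries` and `sample_ncx` are made-up names, and the sample NCX is a hand-written minimal file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"  # standard NCX namespace
NS = {"ncx": NCX_URI}

def toc_entries(ncx_xml):
    """Return (title, file, fragment) for every navPoint, in document order."""
    root = ET.fromstring(ncx_xml)
    entries = []
    for nav_point in root.iter("{%s}navPoint" % NCX_URI):
        label = nav_point.find("ncx:navLabel/ncx:text", NS)
        content = nav_point.find("ncx:content", NS)
        if label is None or content is None:
            continue  # skip malformed entries
        path, fragment = urldefrag(content.get("src", ""))
        entries.append(((label.text or "").strip(), path, fragment))
    return entries

sample_ncx = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="file.xhtml#id2"/>
    </navPoint>
    <navPoint id="n2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="ch2.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""

for title, path, fragment in toc_entries(sample_ncx):
    # A non-empty fragment is the best case; otherwise match on the title text.
    how = "look up id " + fragment if fragment else "search for matching text"
    print(title, "->", path, "(" + how + ")")
```

An entry with a fragment can be resolved by id, while one without it has to fall back to finding a tag whose text equals the TOC title, as described in step 3.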
|
11-30-2017, 07:38 PM | #10 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
@odamizu...By any chance are you using Scrivener to convert to epub? I only mention this because Scrivener always uses <p> tags for main headings whenever you convert to epub.
If you are converting to epub using Scrivener then you can use my NormalizeScrivEpub plugin. This plugin, by default, automatically converts all your main <p> tag headings to <h1> tag headings. Only takes a couple of seconds and you don't have to use regex. Last edited by slowsmile; 11-30-2017 at 08:14 PM. |
11-30-2017, 11:56 PM | #11 | |
just an egg
Posts: 1,586
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
|
I should have known Tag Mechanic could do this. I've been happily using it to get rid of those pesky empty spans, then came across some <p> chapter heads I wanted to change to <h1>, but when I went to Tag Mechanic, the option wasn't there. How foolish of me not to research further; after all, the customization instructions are right there in your first post! (Which I, of course, forgot about after downloading the plug-in to use with empty spans.) Thanks again for the great plug-in and for all your work on Sigil!
Quote:
Oy. I don't even know what Scrivener is! But thanks for the offer! |
|
12-01-2017, 02:08 AM | #12 |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
|
|