Old 11-30-2017, 02:01 AM   #1
odamizu
just an egg
Posts: 1,586
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
convert <p> to <h1>

Too often an epub uses <p> for chapter titles.

Any suggestions on how to convert all those <p> to <h1> other than a manual edit? (I tried Tag Mechanic but couldn't figure out a way to do it that way.)

Or is there a way for those <p> to be recognized when generating a TOC?

Finally, (a bit off-topic of Sigil, sorry), any suggestions from Mac users for a good HTML editor for Mac? I currently use and love Taco, but the developer closed shop years ago, and with every OS update, I fear it will stop working. In fact, I'm still on El Capitan because I'm afraid I'll lose Taco if I update to Sierra or High Sierra.

Thanks!
Old 11-30-2017, 02:12 AM   #2
doubleshuffle
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
If your heading has only the <p> tags, you're out of luck, I guess. But usually there are some style tags that are unique for headings in a particular book, so you can use those to build a regex that finds your headings.

Last edited by doubleshuffle; 11-30-2017 at 03:16 AM.
Old 11-30-2017, 05:42 AM   #3
DiapDealer
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by doubleshuffle View Post
If your heading has only the <p> tags, you're out of luck, I guess. But usually there are some style tags that are unique for headings in a particular book, so you can use those to build a regex that finds your headings.
That's what I'd do, too. If actually changing the tags with regex proves tricky, non-header chapter titles are usually near the top of a page and often have a consistent pattern that can be used to match them (immediately follows the body tag, nested in a div following the body, etc). Find the pattern and give all of those chapter-title p tags (or most of them) a specific class and then safely change them to header tags with Tag Mechanic.
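A minimal Python sketch of the "find the pattern, then add a class" idea, assuming the chapter title is a bare <p> that immediately follows the body tag. The sample page and the class name "chapterHead" are placeholders, and Sigil's own Find & Replace may differ in regex details from Python's re module.

```python
import re

# Hypothetical sample page: the chapter title is the first <p> after <body>.
page = """<body>
<p>Chapter 1</p>
<p>It was a dark and stormy night.</p>
</body>"""

# Match only a bare <p> that directly follows the body tag (whitespace allowed).
pattern = re.compile(r"(<body[^>]*>\s*)<p>")

# Keep the body tag (\1) and give the paragraph a placeholder class.
tagged = pattern.sub(r'\1<p class="chapterHead">', page, count=1)
print(tagged)
```

Once the titles carry a class of their own, they can be targeted by class rather than by position.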
Old 11-30-2017, 10:28 AM   #4
Turtle91
A Hairy Wizard
Posts: 3,095
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
^^^ What they said.

Here's an example:

Code:
search: <body>\s*<p>Chapter (.*?)</p>
replace: <body>\n<h1>Chapter \1</h1>\n
...although I wouldn't use <h1> for a chapter title. I'd use <h2> or even <h3>. Sigil will automagically indent your TOC based on the level of the header tag you use. For example:

Code:
<h1>Title Page</h1>
<h2>maps</h2>
<h2>epigraph</h2>
<h2>Part 1</h2>
<h3>Chapter 1</h3>
<h3>Chapter 2</h3>
<h3>Chapter 3</h3>
<h2>Part 2</h2>
<h3>Chapter 4</h3>
<h3>Chapter 5</h3>
<h3>Chapter 6</h3>
That would look like this (subject to your CSS styling if you also make an inline TOC):
Code:
Title Page
     maps
     epigraph
     Part 1
          Chapter 1
          Chapter 2
          Chapter 3
     Part 2
          Chapter 4
          Chapter 5
          Chapter 6
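
The mapping from heading level to indentation can be sketched in Python as below. This only mimics the idea on a shortened sample; it is not Sigil's TOC generator, and the function name `outline` is made up for the illustration.

```python
import re

# Shortened sample using the same heading levels as the example above.
html = """<h1>Title Page</h1>
<h2>maps</h2>
<h2>Part 1</h2>
<h3>Chapter 1</h3>
<h3>Chapter 2</h3>"""

def outline(markup, indent="     "):
    """Return one line per heading, indented one step per level below h1."""
    lines = []
    # \1 in the pattern makes the closing tag match the opening level.
    for level, text in re.findall(r"<h([1-6])[^>]*>(.*?)</h\1>", markup, flags=re.S):
        lines.append(indent * (int(level) - 1) + text.strip())
    return lines

toc = outline(html)
print("\n".join(toc))
```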

Last edited by Turtle91; 11-30-2017 at 10:30 AM.
Old 11-30-2017, 12:43 PM   #5
odamizu
Quote:
Originally Posted by Turtle91 View Post
Code:
search: <body>\s*<p>Chapter (.*?)</p>
replace: <body>\n<h1>Chapter \1</h1>\n
Omigod, you guys are fabulous!

I'm trying to learn regex, but it's been slow going as regex tends to make my brain curl up into a fetal ball and start whimpering.

Finding the chapter headings hasn't been a problem. I search: <p class="chapterHead"> replace: <h1 class="chapterHead">

But then I would go in and manually change each closing </p> tag to an </h1> Urgh.

I've tried search: <p class="chapterHead">.*?</p> replace <h1 class="chapterHead">.*?</h1> but that had unhappy results.

Thanks, Turtle91, for the proper regex to automate the entire process!
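
A minimal Python sketch of why the `.*?` attempt above had unhappy results, and how a capture group fixes it. In a replacement string, `.*?` is plain text rather than a wildcard, so every title is overwritten with those literal characters. The sample markup is hypothetical, and Python's re module stands in for Sigil's regex engine here, which may differ in details.

```python
import re

# Hypothetical sample: one chapter title and one ordinary paragraph.
sample = '<p class="chapterHead">Chapter 1</p>\n<p>Body text stays a paragraph.</p>'

# The failed attempt: ".*?" in the replacement is inserted literally.
broken = re.sub(r'<p class="chapterHead">.*?</p>',
                '<h1 class="chapterHead">.*?</h1>', sample)

# The fix: capture the title with (...) and put it back with \1.
fixed = re.sub(r'<p class="chapterHead">(.*?)</p>',
               r'<h1 class="chapterHead">\1</h1>', sample)

print(broken)
print(fixed)
```

The same search and replace strings, with the parentheses and `\1`, follow the pattern of Turtle91's example and change the opening and closing tags in one pass.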
Old 11-30-2017, 01:58 PM   #6
DiapDealer
Quote:
Originally Posted by odamizu View Post
Finding the chapter headings hasn't been a problem. I search: <p class="chapterHead"> replace: <h1 class="chapterHead">

But then I would go in and manually change each closing </p> tag to an </h1> Urgh.
If the chapter headings are that easy to find, then Tag Mechanic should have been able to do what you want fairly painlessly. Just change all p tags with a class attribute value of "chapterHead" to h1. That would have taken care of the opening and closing tags for you -- not that I would dream of discouraging anyone from picking up more regex experience.

[Image: tagmech.jpg]

By default, the only thing the plugin will let you change a p tag to is a div tag. But you can change that in the plugin's customization config. Just right-click anywhere on the plugin's GUI dialog (the above image) and select "Customize Plugin" from the menu. Then add h1, h2, h3, etc. (comma-separated) to the list of tags available for p tag manipulation and click "Apply & Close."

[Image: tagmech2.jpg]

NOTE: make sure you have all the relevant xhtml files highlighted in Sigil's Book Browser before launching the plugin. The plugin only processes/searches/affects those files which are selected.

Last edited by DiapDealer; 11-30-2017 at 02:33 PM.
Old 11-30-2017, 02:36 PM   #7
odamizu
Quote:
Originally Posted by DiapDealer View Post
... By default, the only thing the plugin will let you change a p tag to is a div tag. But you can change that in the plugin's customization config.
Omigod!

I loved Tag Mechanic before, but now I love it even more!



(And once my brain uncurls out of fetal position, I will continue my attempts to learn more regex.)
Old 11-30-2017, 02:58 PM   #8
DiapDealer
Happy to help.
Old 11-30-2017, 07:21 PM   #9
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by odamizu View Post
Any suggestions on how to convert all those <p> to <h1> other than a manual edit?
Some time ago I slapped together a quick & dirty plugin that'll automatically change paragraph tags to heading tags based on NCX TOC entries.

Obviously this only works with epub2 books that have a valid (=working) NCX TOC and it works best with NCX TOC entries that reference fragment ids (e.g. file.xhtml#id2). BTW, it does the following:

1. It parses the NCX and generates a list of TOC entries.
2. If the target href has a fragment id, it'll look for the target tag based on the fragment id and change the tag (or its parent tag) to a heading tag. (=Best-case scenario.)
3. If the target href doesn't have a fragment id, it'll look for the first tag with the same text as the TOC entry.
4. If the previous step failed, it'll insert a dummy heading tag with the TOC entry as the title attribute. This is the worst-case scenario, but at least it'll allow you to generate a TOC with Sigil.
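
Step 1, and the fragment-id check that steps 2 and 3 branch on, might look roughly like the Python sketch below. This is not the plugin's code: the sample NCX, the file names, and the function name `toc_entries` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by epub2 NCX files.
NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

# Hypothetical two-entry NCX: one target with a fragment id, one without.
ncx = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="ch1.xhtml#id2"/>
    </navPoint>
    <navPoint id="n2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="ch2.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""

def toc_entries(ncx_text):
    """Return (label, file name, fragment id or None) for every navPoint."""
    root = ET.fromstring(ncx_text)
    entries = []
    for point in root.iterfind(".//ncx:navPoint", NCX_NS):
        label = point.find("ncx:navLabel/ncx:text", NCX_NS).text.strip()
        src = point.find("ncx:content", NCX_NS).get("src")
        # Split "file.xhtml#id2" into the file name and the fragment id.
        filename, _, fragment = src.partition("#")
        entries.append((label, filename, fragment or None))
    return entries

entries = toc_entries(ncx)
print(entries)
```

Entries with a fragment id correspond to the best-case scenario in step 2; entries where it is None would fall through to the text match in step 3.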

Since I only tested this plugin with a couple of old valid Calibre-generated books with working NCX TOCs but no heading tags, I don't feel comfortable releasing it, but if you want to test this beta version, PM me for a Dropbox link.
Old 11-30-2017, 07:38 PM   #10
slowsmile
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
@odamizu...By any chance are you using Scrivener to convert to epub? I only mention this because Scrivener always uses <p> tags for main headings whenever you convert to epub.

If you are converting to epub using Scrivener then you can use my NormalizeScrivEpub plugin. This plugin, by default, automatically converts all your main <p> tag headings to <h1> tag headings. Only takes a couple of seconds and you don't have to use regex.

Last edited by slowsmile; 11-30-2017 at 08:14 PM.
Old 11-30-2017, 11:56 PM   #11
odamizu
Quote:
Originally Posted by DiapDealer View Post
Happy to help.
I should have known Tag Mechanic could do this. I've been happily using it to get rid of those pesky empty spans, then came across some <p> chapter heads I wanted to change to <h1>, but when I went to Tag Mechanic, the option wasn't there. How foolish of me not to research further; after all, the customization instructions are right there in your first post (which I, of course, forgot about after downloading the plug-in to use with empty spans)! Thanks again for the great plug-in and for all your work on Sigil!

Quote:
Originally Posted by Doitsu View Post
Some time ago I slapped together a quick & dirty plugin that'll automatically change paragraph tags to heading tags based on NCX TOC entries.

Obviously this only works with epub2 books that have a valid (=working) NCX TOC and it works best with NCX TOC entries that reference fragment ids (e.g. file.xhtml#id2).

... if you want to test this beta version, PM me for a Dropbox link.
Tag Mechanic (with Turtle91's regex lesson as a backup) has solved my problem. But if you'd like me to test your plug-in, I'd be happy to. Will it also work with epub3s that have an NCX TOC, or just epub2s?

Quote:
Originally Posted by slowsmile View Post
@odamizu...By any chance are you using Scrivener to convert to epub? ... If you are converting to epub using Scrivener then you can use my NormalizeScrivEpub plugin ...
Oy. I don't even know what Scrivener is! But thanks for the offer!
Old 12-01-2017, 02:08 AM   #12
Doitsu
Quote:
Originally Posted by odamizu View Post
But if you'd like me to test your plug-in, I'd be happy to. Will it also work with an epub3s that have an NCX TOC, or just epub2s?
I haven't tested this, but if you select Tools > Epub3 Tools > Generate NCX from Nav it should work.