retagging chapter heads?


affa
04-19-2011, 03:06 PM
EDIT - solved this myself, second post in case someone needs it

I've spent the past 72 hours playing with InDesign, Sigil, and Calibre, trying to get my print books into valid EPUB format, with varying degrees of success.

I've run into a couple walls, the first of which is identifying chapters.

After exporting to EPUB from InDesign (slightly simplified / rearranged from the original print version), if I open the EPUB in Sigil I have one huge file and the chapter heads look like:

<p class="chapter-head"><span class="generated-style">This is a Chapter Name</span></p>

But if I understand everything correctly, and from reading this forum, it would be greatly preferable for them to be

<hr class="sigilChapterBreak" />
<h2>This is a Chapter Name</h2>

As this would then allow me to use Sigil to split on chapter breaks, properly divide my ebook up, and also generate a proper TOC.

Now, my problem is that while it's easy enough to search and replace the first half of that markup, the second half
</span></p>

isn't so simple because it's not unique and is found everywhere.

Is there any way to do a Regular Expression search/replace that will take the above markup, with some sort of wildcard for the chaptername, that will return the new markup with the original Chaptername? I figured out how to do this in GREP once for something else, but I'm currently just lost. Then, I can just run this search replace on all of my books, and it would be of massive help to me.

If someone has a completely alternate suggestion, I'm all ears as well, but this seems to be the quickest and makes the most sense. I'd even be ok with doing the search and replace using some other tool if necessary (unzipping the epub) but was hoping I could do it inside of Sigil for obvious reasons.

affa
04-19-2011, 03:37 PM
figured it out myself

SEARCH
<p class="chapter-head"><span class="generated-style">(.*)</span></p>

REPLACE
<hr class="sigilChapterBreak" /><h2>\1</h2>
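
For anyone who would rather run this outside Sigil on an unzipped epub, here is a minimal Python sketch of the same search/replace. The function name retag_chapter_heads and the sample string are made up for illustration, and the non-greedy (.*?) is an assumption added here (not in the post above) so that two chapter heads on one line are not swallowed as a single match.

```python
import re

# Same pattern as the SEARCH above, except the group is non-greedy (.*?),
# which is an assumption added for safety, not part of the original post.
CHAPTER_HEAD = re.compile(
    r'<p class="chapter-head"><span class="generated-style">(.*?)</span></p>'
)

# Same text as the REPLACE above; \1 is the captured chapter name.
REPLACEMENT = r'<hr class="sigilChapterBreak" /><h2>\1</h2>'


def retag_chapter_heads(html: str) -> str:
    """Return html with InDesign-style chapter heads rewritten as <hr/> + <h2>."""
    return CHAPTER_HEAD.sub(REPLACEMENT, html)


# Hypothetical sample line, copied from the markup shown in the first post.
sample = (
    '<p class="chapter-head"><span class="generated-style">'
    'This is a Chapter Name</span></p>'
)
print(retag_chapter_heads(sample))
# prints: <hr class="sigilChapterBreak" /><h2>This is a Chapter Name</h2>
```

Paragraphs with any other class are left untouched, since the class names are part of the pattern.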

Faster
04-19-2011, 03:55 PM
Here it is for ANY class and ANY span:


Find: <p[^>]*><span[^>]*>((Chapt|CHAPT)[^</]*)</span></p>

Replace: <hr class="sigilChapterBreak" /><h2>\1</h2>
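
A quick way to see what this broader pattern does and does not catch is to try it in Python. This is only a sketch: the function name retag_generic and the three sample strings are invented for illustration, and Python's re module is assumed to treat this particular pattern the same way Sigil's regex engine does.

```python
import re

# The Find pattern from the post above, unchanged.
# Group 1 is the whole chapter title; group 2 is just "Chapt" or "CHAPT".
GENERIC_HEAD = re.compile(
    r'<p[^>]*><span[^>]*>((Chapt|CHAPT)[^</]*)</span></p>'
)
REPLACEMENT = r'<hr class="sigilChapterBreak" /><h2>\1</h2>'


def retag_generic(html: str) -> str:
    """Rewrite any <p><span>Chapt...</span></p> pair as <hr/> + <h2>."""
    return GENERIC_HEAD.sub(REPLACEMENT, html)


# Hypothetical inputs:
examples = [
    '<p class="foo"><span class="bar">Chapter One</span></p>',  # rewritten
    '<p class="foo"><span class="bar">Prologue</span></p>',     # no "Chapt": unchanged
    '<p><span>CHAPTER 1/2</span></p>',                          # "/" in title: unchanged
]
for line in examples:
    print(retag_generic(line))
```

Two things to keep in mind: the pattern only matches headings whose text starts with "Chapt" or "CHAPT", and because [^</] excludes both "<" and "/", a title containing a slash or any nested tag will not be matched.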