MobileRead Forums - View Single Post - Q: RegEx to join chapter number and title

DiapDealer · 05-29-2015, 07:30 AM

Quote:

Originally Posted by phossler

I have a bunch of entries with a Hx followed by the rest of the chapter title in <p> tags

Code:

<h1>Chapter 1:*</h1>
<p>The Title</p>

What I'd like is

Code:

<h1>Chapter 1: The Title</h1>

My Find was *</h1>\s*<p>(.*?)</p>

My Replace was \1</h1>

but instead of just the text up to the first </p>, all the text is selected

What am I doing wrong?

You're including all of the <p> </p> in your find expression. That's why it's included: <p>(.*?)</p>

I'd suggest capturing the chapter number and the p contents with something like:

Code:

<h1>Chapter (\d+):</h1>\s+<p>([^>]*)</p>

Then replace with something like:

Code:

<h1>Chapter \1: \2</h1>
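
If you want to try the find and replace outside your editor, here is a minimal sketch using Python's re module. The sample markup and the variable names are made up for illustration, and your editor's regex engine may differ from Python's in small ways.

```python
import re

# Sample markup in the shape described in the question (invented for this test).
html = (
    "<h1>Chapter 1:</h1>\n"
    "<p>The Title</p>\n"
    "<p>First paragraph of the chapter.</p>"
)

# Group 1 captures the chapter number, group 2 the contents of the first <p>.
pattern = re.compile(r"<h1>Chapter (\d+):</h1>\s+<p>([^>]*)</p>")

# \1 and \2 in the replacement refer back to the two captured groups.
result = pattern.sub(r"<h1>Chapter \1: \2</h1>", html)
print(result)
```

Only the heading and the paragraph right after it are rewritten; the following paragraph is left alone because it is not preceded by an <h1>.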

The * and + tokens in regex are for repetition (* is zero or more, and + is one or more). They follow something that may have multiple occurrences. Unless you're actually looking to match an asterisk character (in which case it should be escaped: \*), there's no reason to start an expression with "*": there's no indication of what might be repeating.
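
A quick way to see the difference is the sketch below, again in Python's re module. This shows how that one engine behaves; the engine in your editor may react differently to a leading "*". The variable names are only for illustration.

```python
import re

# A quantifier with nothing in front of it: Python's re refuses to compile it.
try:
    re.compile(r"*</h1>")
    leading_star_ok = True
except re.error:
    leading_star_ok = False

# Escaped, the asterisk is just an ordinary character to match.
literal = re.search(r"\*</h1>", "<h1>Chapter 1:*</h1>")

# * accepts zero occurrences of the preceding item; + needs at least one.
star = re.fullmatch(r"ab*", "a")   # matches: zero b's is allowed
plus = re.fullmatch(r"ab+", "a")   # no match: at least one b is required

print(leading_star_ok, literal is not None, star is not None, plus is not None)
```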