Old 05-28-2015, 10:14 PM   #1
phossler
Q: RegEx to join chapter number and title

I have a bunch of entries with an Hx heading followed by the rest of the chapter title in <p> tags:

Code:
<h1>Chapter 1:*</h1>
<p>The Title</p>
What I'd like is

Code:
<h1>Chapter 1: The Title</h1>
My Find was *</h1>\s*<p>(.*?)</p>

My Replace was \1</h1>

but instead of just the text up to the first </p>, all the text is selected

What am I doing wrong?
Old 05-29-2015, 12:46 AM   #2
doubleshuffle
No Regex expert here, but just tested this and it seems to do what you want it to do:

Find:
Code:
<h1>(.*)</h1>
<p>(.*)</p>
Replace:
Code:
<h1>\1 \2</h1>
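For anyone who wants to try this outside the editor, here is a minimal Python sketch of the same find and replace. It assumes the editor's engine treats this pattern the same way as Python's built-in `re` module, which I have not verified in every detail. The `sample` string and the `join_heading` helper are made up for illustration.

```python
import re

# Hypothetical sample markup, modelled on the question above.
sample = "<h1>Chapter 1:</h1>\n<p>The Title</p>"

# The Find expression spans two lines, so it contains a literal line break:
# the <p> must start at the very beginning of the line after </h1>.
FIND = r"<h1>(.*)</h1>\n<p>(.*)</p>"
REPLACE = r"<h1>\1 \2</h1>"


def join_heading(html: str) -> str:
    """Merge each <h1> with the <p> on the line right after it."""
    return re.sub(FIND, REPLACE, html)


print(join_heading(sample))  # <h1>Chapter 1: The Title</h1>
```

Because `.` does not match a line break by default, the greedy `(.*)` stays on its own line here. If the `<p>` is indented or separated by a blank line, this version will not match; putting `\s*` or `\s+` between the two tags is more forgiving.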
Old 05-29-2015, 07:30 AM   #3
DiapDealer
You're including all of the <p> </p> in your find expression. That's why it's included: <p>(.*?)</p>

I'd suggest capturing the chapter number and the p contents with something like:
Code:
<h1>Chapter (\d+):</h1>\s+<p>([^>]*)</p>
Then replace with something like:
Code:
<h1>Chapter \1: \2</h1>
The * and + tokens in regex are for repetition (* is zero or more, and + is 1 or more). They follow something that may have multiple occurrences. Unless you're actually looking to match an asterisk character (in which case it should be escaped: \*) there's no reason to start an expression with "*". There's no indication of what might be repeating.
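
To make the repetition tokens concrete, here is a small Python sketch using the built-in `re` module. Treat it as an illustration under assumptions: the `sample` text is invented, and the editor's regex engine may react to a leading `*` differently than `re` does.

```python
import re

# Hypothetical sample: two chapters, each followed by its title paragraph.
sample = (
    "<h1>Chapter 1:</h1>\n"
    "<p>The Title</p>\n"
    "<h1>Chapter 2:</h1>\n"
    "  <p>Another Title</p>\n"
)

# \d+   one or more digits (the chapter number)
# \s+   one or more whitespace characters between </h1> and <p>
# [^>]* zero or more characters that are not '>'
FIND = re.compile(r"<h1>Chapter (\d+):</h1>\s+<p>([^>]*)</p>")
REPLACE = r"<h1>Chapter \1: \2</h1>"

result = FIND.sub(REPLACE, sample)
print(result)

# A quantifier with nothing in front of it is rejected by Python's re module.
try:
    re.compile(r"*</h1>")
    leading_star_ok = True
except re.error:
    leading_star_ok = False

# Escaped, the asterisk is just an ordinary character to match.
literal_star = re.search(r"\*</h1>", "<h1>Chapter 1:*</h1>")
```

Both headings end up as `<h1>Chapter N: Title</h1>`, the unescaped leading `*` fails to compile in `re`, and the escaped `\*` finds the literal asterisk.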

Last edited by DiapDealer; 05-29-2015 at 07:42 AM.
Old 05-29-2015, 10:51 AM   #4
phossler
@doubleshuffle + @DiapDealer --

Thanks for pointing me in the right direction. Much easier to follow.

I ended up adding this to my Saved Searches:

Find: <h1>(.*?)</h1>\s+<p>([^>]*)</p>
Replace: <h1>\1 <br/>\2</h1>
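
A rough way to check this saved search outside the editor is the Python sketch below. It assumes Python's built-in `re` module matches the editor's behaviour for this pattern, and the `sample` text is invented. The middle entry is there on purpose: `[^>]*` cannot cross a `>`, so a title paragraph that contains inline tags is left alone.

```python
import re

# Hypothetical sample: the second title contains an inline <i> tag.
sample = (
    "<h1>Chapter 1:</h1>\n<p>The Title</p>\n"
    "<h1>Chapter 2:</h1>\n<p>A <i>Styled</i> Title</p>\n"
    "<h1>Epilogue</h1>\n<p>Loose Ends</p>\n"
)

# (.*?) lazily captures whatever the heading holds, not just "Chapter N:".
FIND = r"<h1>(.*?)</h1>\s+<p>([^>]*)</p>"
REPLACE = r"<h1>\1 <br/>\2</h1>"

# re.subn returns the new text plus the number of replacements made.
merged, count = re.subn(FIND, REPLACE, sample)
print(merged)
print(count)  # 2
```

If styled titles need to be merged as well, the second group would have to allow tags, which this sketch does not attempt.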


I went back and I have NO idea how I pasted the '*' into the Find expression in my original question. It wasn't in the Find box, so I'll blame the keyboard???

Last edited by phossler; 05-31-2015 at 09:44 AM.