05-07-2014, 01:55 AM   #1
timberbeast
stumblebum
Posts: 29
Karma: 10
Join Date: Nov 2013
Location: Roseburg, OR
Device: kindle2
A little help with a regex please, if you don't mind?

First things first. Thank you very much, Kovid, for your top-notch program. Calibre is powerful as heck, and a lot of fun to use. Then you added the Editor, and now it is 3 times as valuable, in my opinion.

I have a book that is really chopped up. There were four lines in the editor for every book page just for headers, but I fixed those using S&R, no problem. The real problem is when I try to remove a bunch of the html tags to clean it up. You can see what I mean:

Code:
<p class="calibre1">Fallon was shaking his head. “Let me tell you what the people in</p>
<p class="calibre1">Washington say is stenciled on that woman’s undies. ‘Virginia Larue’s</p>
<p class="calibre1">Home for Wayward Boys.’ Ginny Larue is a regular one-woman</p>
As you can see, in most of them there is just one character in the match that I need to keep: the last letter before the closing tag.

I used this to find them.
Code:
 \w</p>\s<p.\w+..\w+..
I would like to use just a *space* to replace them.

Obviously, I can't use S&R to fix them as is without hosing my book, since the match includes the last letter of each line. Is there any way I can rewrite the regular expression so that it won't select the last character just before the closing tag?
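
To make the question concrete, here is a rough sketch in plain Python of two common ways to keep that last character. It assumes the editor's regex mode accepts Python-style syntax, and that every paragraph tag is exactly `<p class="calibre1">` as in the sample above. The variable names are made up for illustration. The first pattern captures the character and puts it back with `\1`; the second uses a lookbehind so the character is never part of the match.

```python
import re

# The three lines from the sample, as they appear in the book's HTML.
sample = (
    '<p class="calibre1">Fallon was shaking his head. “Let me tell you what the people in</p>\n'
    '<p class="calibre1">Washington say is stenciled on that woman’s undies. ‘Virginia Larue’s</p>\n'
    '<p class="calibre1">Home for Wayward Boys.’ Ginny Larue is a regular one-woman</p>\n'
)

# Option 1: capture the last word character and put it back in the replacement.
with_group = re.sub(r'(\w)</p>\s<p class="calibre1">', r'\1 ', sample)

# Option 2: a lookbehind checks for the word character without selecting it,
# so the replacement is just a space.
with_lookbehind = re.sub(r'(?<=\w)</p>\s<p class="calibre1">', ' ', sample)

print(with_group)
```

In an S&R dialog, the equivalent would presumably be the first pattern in the Find box with `\1 ` (including the trailing space) in the Replace box, or the second pattern with a single space as the replacement.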

Thank you.
one of your faithful lurkers,
larry