Old 05-07-2014, 01:55 AM   #1
timberbeast
stumblebum
Posts: 29
Karma: 10
Join Date: Nov 2013
Location: Roseburg, OR
Device: kindle2
A little help with a regex please, if you don't mind?

First things first. Thank you very much, Kovid, for your top-notch program. Calibre is powerful as heck, and a lot of fun to use. Then you added the Editor, and it is 3 times as valuable, in my opinion.

I have a book that is really chopped up. There were four lines in the editor for every book page just for headers, but I fixed those using S&R, no problem. The real problem is when I try to remove a bunch of the HTML tags to clean it up. You can see what I mean:

Code:
<p class="calibre1">Fallon was shaking his head. “Let me tell you what the people in</p>
<p class="calibre1">Washington say is stenciled on that woman’s undies. ‘Virginia Larue’s</p>
<p class="calibre1">Home for Wayward Boys.’ Ginny Larue is a regular one-woman</p>
As you can see, there is just one character in most of them that I need to keep rather than remove.

I used this to find them.
Code:
 \w</p>\s<p.\w+..\w+..
I would like to use just a *space* to replace them.

Obviously, I can't use S&R to fix them without hosing my book. Is there any way I can rewrite the regular expression so that it won't select the last character just before the closing tag?

Thank you.
one of your faithful lurkers,
larry
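For anyone following along, the problem described above can be reproduced in a few lines of Python. This is only a sketch, assuming the editor's search uses Python-style regex syntax; the variable names are made up for illustration, and the sample text is two of the lines quoted in the post.

```python
import re

# Two of the chopped-up lines quoted in the post, separated by a newline.
html = (
    '<p class="calibre1">Fallon was shaking his head. “Let me tell you what the people in</p>\n'
    '<p class="calibre1">Washington say is stenciled on that woman’s undies. ‘Virginia Larue’s</p>'
)

# The search pattern from the post (without its leading space).
pattern = re.compile(r'\w</p>\s<p.\w+..\w+..')

# The leading \w is part of the match, so the last letter of the line
# is selected along with the tags.
matched_text = pattern.search(html).group(0)

# Replacing the match with a single space therefore deletes that letter too.
joined = pattern.sub(' ', html)

print(repr(matched_text))
print(joined)
```

The match begins with the `n` of "in", so replacing it with a space turns "people in" into "people i".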
Old 05-07-2014, 02:03 AM   #2
kovidgoyal
creator of calibre
Posts: 34,423
Karma: 10323934
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
</p>\s+<p[^>]*>

This will match every closing </p> tag that is followed by an opening <p> tag, so replacing with a space removes both. You probably don't want to run this on the entire book, so use the marked region to do it.
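A rough Python sketch of what this search and replace does, assuming Python-style regex syntax and a single space as the replacement (as asked for in the first post). The variable names are illustrative, and the sample text is the three lines quoted earlier in the thread.

```python
import re

# The three chopped-up lines from the first post.
html = (
    '<p class="calibre1">Fallon was shaking his head. “Let me tell you what the people in</p>\n'
    '<p class="calibre1">Washington say is stenciled on that woman’s undies. ‘Virginia Larue’s</p>\n'
    '<p class="calibre1">Home for Wayward Boys.’ Ginny Larue is a regular one-woman</p>'
)

# A closing </p>, any whitespace, then an opening <p ...> with any attributes.
join_paragraphs = re.compile(r'</p>\s+<p[^>]*>')

# Replace each tag pair with one space; the text before </p> is untouched.
joined = join_paragraphs.sub(' ', html)

print(joined)
```

Because the pattern starts at `</p>` and not at the character before it, the last letter of each line survives. It joins every paragraph boundary it finds, though, which is why limiting it to a marked region is suggested.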
Old 05-07-2014, 02:51 AM   #3
timberbeast
Thank you Kovid. That works well. A lot more elegant than mine for sure.

Maybe I'm dreaming, but I was hoping to find something that would ignore whole sentences that end with a (.) or a (") just before the closing tag.

That is why I started my searches with (\w</p>). It ignores the sentences with ending punctuation. But it picks up the ending character, which I could do without.

There are 9655 reasons for my grasping at straws. I am at line 1282 by hand, so far. If I didn't like the book so well, I would have bagged it a long time ago.

thanks,
larry
Old 05-07-2014, 02:58 AM   #4
kovidgoyal
(?<![".])</p>\s+<p[^>]*>
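The `(?<![".])` at the front is a negative lookbehind: the match is allowed only if the character just before `</p>` is neither a double quote nor a period, and that character is checked without being selected. A small Python sketch of the effect, using made-up sample lines and assuming a single space as the replacement:

```python
import re

# Hypothetical sample: line 1 is cut mid-sentence, line 2 ends with a period,
# line 3 ends with a straight double quote.
html = (
    '<p class="calibre1">Let me tell you what the people in</p>\n'
    '<p class="calibre1">Washington say.</p>\n'
    '<p class="calibre1">He said "no"</p>\n'
    '<p class="calibre1">A new paragraph starts here.</p>'
)

# Same tag-pair pattern as before, guarded by a negative lookbehind.
pattern = re.compile(r'(?<![".])</p>\s+<p[^>]*>')

# Only the boundary after "in" is joined; the other two are left alone.
joined = pattern.sub(' ', html)

print(joined)
```

One thing to watch, judging only from the sample quoted in the first post: that text uses curly quotes, and the character class here lists only the straight `"`. A line ending in `”` would still be joined unless that character is added inside the brackets.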
Old 05-07-2014, 03:13 AM   #5
timberbeast
Thank you, thank you, Kovid! You ARE "the man"! I saved it as "Kovid's". It won't take long to fix now. Tomorrow. Woops. It's tomorrow here. Too late for this old fart. Have a good day.

larry

Last edited by timberbeast; 05-07-2014 at 03:15 AM. Reason: punctuation :)
Old 05-07-2014, 03:18 AM   #6
kovidgoyal
You're welcome. And if you wish to learn how it works, read up on negative lookbehind assertions here: https://docs.python.org/2.7/library/re.html
Old 05-07-2014, 09:10 AM   #7
timberbeast
I'm on it. Thanks again, Kovid.

I went and added hyphens to the newest regex too. It picks up orphaned hyphens from the chop job as well.

The 'Replace and Find' button makes quick work of that mess now. I'm a happy camper.

cheers,
larry
Old 05-08-2014, 08:05 AM   #8
user743
Addict
Posts: 243
Karma: 44444
Join Date: Mar 2014
Device: Kindle PW2 special offers removed by Amazon for FREE
I'm just warning you that one day you won't be able to distinguish one saved search from another. DON'T USE THAT NAMING CONVENTION, IT'S DANGEROUS.

Thanks kovidgoyal for making this post possible.

Last edited by user743; 05-08-2014 at 08:08 AM.
Old 05-08-2014, 08:28 PM   #9
timberbeast
Quote:
I'm just warning you that one day you won't be able to distinguish one saved search from another. DON'T USE THAT NAMING CONVENTION, IT'S DANGEROUS.
Speaking from experience?

Nah, by the time I forget those, I'll be able to write my own, and won't need 'em.

I'm fascinated by regexes, and what you can do with them. Especially the lookaheads and the lookbehinds. I'm slooowly figuring out how they work, and which to use, and when.

[As a side note:] I had downloaded several applications to test regexes as learning tools. At least the Linux ones I could find.

Then it came to me as I was using the Edit Book feature in Calibre. I told myself, "Self, you are a dumbass. One of the best testing applications in the world, for what you do, is right in front of your face! You don't need any of those others."

Cheers,
larry
Old 05-08-2014, 08:50 PM   #10
eschwartz
Ex-Helpdesk Junkie
Posts: 19,278
Karma: 83106403
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Another resource for learning regex, which is more readable IMHO: http://www.regular-expressions.info
Old 05-08-2014, 09:03 PM   #11
timberbeast
Thanks eschwartz, I'll check it out. Anything I can get my grubby paws on about regexes is highly appreciated.

larry
Tags
editor, html tags, regex