09-06-2014, 12:06 PM   #1
markallanson
Regular expressions in the replace field

I have several e-books that were converted from PDF to AZW3. A lot of them have sentences that are broken by tags such as:

<p class="calibre1">“Thank you.” Justin made himself comfortable on the love seat </p>

<p class="calibre1">and accepted a glass of wine. “I think I’ve just overindulged in too much food while I’ve been here. I don’t get this kind of cooking at home. In fact, I usually skip meals.” </p>

Is there a way to use search and replace with regex to merge the broken sentences back together?


09-06-2014, 12:25 PM   #2
DrChiper
Try using regex:
Code:
Find: </p> <p class="calibre1">([a-z])
Replace: \1
That might do the trick.
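
For anyone who wants to try the pattern outside the editor first, here is a minimal Python sketch of the same find and replace. It assumes the two paragraphs are separated by exactly one space, as the Find string above requires; the shortened sample text and the variable names are only for illustration.

```python
import re

# The Find pattern from the post: a paragraph break followed by a lowercase letter.
pattern = re.compile(r'</p> <p class="calibre1">([a-z])')

# Shortened sample, with a single space between the two paragraphs (an assumption).
html = ('<p class="calibre1">Justin made himself comfortable on the love seat </p> '
        '<p class="calibre1">and accepted a glass of wine.</p>')

# The Replace string \1 keeps only the captured letter, which drops the tags.
merged = pattern.sub(r'\1', html)
print(merged)
```

The captured letter has to be put back by the replacement, otherwise the first character of the second paragraph would be lost.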

09-06-2014, 12:35 PM   #3
markallanson
Thanks DrChiper, that works, except that it also finds upper case letters.

09-06-2014, 12:35 PM   #4
theducks
Dr C's tip is a great start. I prefer to use </p>\s+<p to allow for variations in code indentation.

There will (always) be exceptions

Mr.</p>
<p>Jones


I have a series of these regex patterns that I use.

The one given is my usual first pass. Use Replace All with care: there are EXCEPTIONS where the lines should NOT be joined.

IMHO, always do a bunch of Find and Replace steps manually before letting the regex loose (if at all), to see if there are gotchas in your pattern.
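
A rough Python sketch of that "look before you replace" habit follows. The pattern is an assumed combination of the `</p>\s+<p` idea with the lowercase capture from the earlier post, and `preview` is a hypothetical helper, not part of calibre. It only lists the candidate joins with some context so each one can be checked by eye.

```python
import re

# Assumed combination: "</p>\s+<p" plus the lowercase capture from the earlier post.
JOIN = re.compile(r'</p>\s+<p class="calibre1">([a-z])')

def preview(html, width=20):
    """Return a short context snippet for each candidate join, for manual review."""
    snippets = []
    for m in JOIN.finditer(html):
        start = max(m.start() - width, 0)
        end = min(m.end() + width, len(html))
        snippets.append(html[start:end])
    return snippets

# Made-up sample: one broken sentence, plus a "Mr." / "Jones" style split.
sample = ('<p class="calibre1">on the love seat </p>\n\n'
          '<p class="calibre1">and accepted a glass of wine.</p>\n'
          '<p class="calibre1">Mr.</p>\n'
          '<p class="calibre1">Jones arrived late.</p>')

candidates = preview(sample)
for snippet in candidates:
    print(repr(snippet))
```

On this sample only the lowercase continuation is reported. The "Mr." / "Jones" split starts with a capital letter, so this pattern never sees it, which is the kind of exception that still needs a manual pass.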

09-06-2014, 12:39 PM   #5
DrChiper
theducks: Sound advice.
Always test before using on your whole document. Do not forget to check the "case sensitive" check-box.
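
To see why the check-box matters, here is a small Python sketch. It assumes that a case-insensitive search in the editor behaves like Python's `re.IGNORECASE` flag, under which `[a-z]` also matches capital letters; the sample text is made up.

```python
import re

PATTERN = r'</p>\s+<p class="calibre1">([a-z])'

# Two paragraphs that should stay separate: the second starts with a capital.
text = '<p class="calibre1">The end.</p>\n<p class="calibre1">Chapter Two</p>'

# Case-insensitive: [a-z] also accepts "C", so this break would be joined by mistake.
insensitive = re.search(PATTERN, text, re.IGNORECASE)

# Case-sensitive: only lowercase continuations match, so nothing is found here.
sensitive = re.search(PATTERN, text)

print(insensitive is not None, sensitive is not None)
```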

09-06-2014, 12:44 PM   #6
markallanson
Case sensitive. Why didn't I think of that? Thanks a ton.
All times are GMT -4.

