09-06-2014, 12:06 PM | #1 |
Couch Potato
Posts: 5
Karma: 10
Join Date: Sep 2012
Device: Kindle Keyboard 3rd Gen.
|
Regular expressions in the replace field
I have several e-books that were converted from PDF to AZW3. A lot of them have sentences that are broken by tags such as:
<p class="calibre1">“Thank you.” Justin made himself comfortable on the love seat </p> <p class="calibre1">and accepted a glass of wine. “I think I’ve just overindulged in too much food while I’ve been here. I don’t get this kind of cooking at home. In fact, I usually skip meals.” </p>
Is there a way to use search and replace with regex to merge the broken sentences back together? |
09-06-2014, 12:25 PM | #2 |
Bookish
Posts: 910
Karma: 1803094
Join Date: Jun 2011
Device: PC, t1, t2, t3, aura 2 v1, clara HD, Libra 2, Nxtpaper 11
|
Try using regex:
Code:
Find: </p> <p class="calibre1">([a-z])
Replace: \1 |
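For anyone who wants to try the pattern outside the editor first, here is a minimal Python sketch of the same Find/Replace. It assumes Python's `re` module behaves like the editor's regex engine for this simple pattern; the `html` string is just a shortened copy of the sample from the first post.

```python
import re

# Shortened sample from the first post; the paragraph break falls mid-sentence.
html = (
    '<p class="calibre1">“Thank you.” Justin made himself comfortable on the love seat </p> '
    '<p class="calibre1">and accepted a glass of wine.</p>'
)

# Same as the Find field: a paragraph break followed by a lowercase letter.
pattern = re.compile(r'</p> <p class="calibre1">([a-z])')

# Same as the Replace field: \1 puts the captured letter back,
# so only the closing and opening tags are dropped.
merged = pattern.sub(r'\1', html)
print(merged)
```

The space before `</p>` in the sample is what keeps "seat" and "and" apart after the merge; if a book lacks that trailing space, the Replace field would probably need a leading space before `\1`.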
|
09-06-2014, 12:35 PM | #3 |
Couch Potato
Posts: 5
Karma: 10
Join Date: Sep 2012
Device: Kindle Keyboard 3rd Gen.
|
Thanks DrChipper, that works, except that it also finds uppercase letters.
|
09-06-2014, 12:35 PM | #4 |
Well trained by Cats
Posts: 29,839
Karma: 54837878
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Dr C's tip is a great start. I prefer to use </p>\s+<p to allow for variations in code indentation.
There will (always) be exceptions, such as Mr.</p> <p>Jones
I have a series of these regex patterns that I use. The one given is my usual first pass. Use ALL with care: there are EXCEPTIONS where the lines should NOT be joined.
IMHO, always do a bunch of Find/Replace steps manually before letting the regex loose (if at all), to see whether there are gotchas in your pattern. |
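One way to do that manual check in bulk is to list every place the pattern would join before replacing anything. The sketch below is only an illustration in Python: `preview_joins` is a hypothetical helper, and the part of the pattern after `</p>\s+<p` (optional attributes, then a lowercase letter) is an assumption, not the exact pattern described above.

```python
import re

# Looser pattern: any whitespace between the tags, optional attributes on <p>,
# then a lowercase letter (assumed continuation of the </p>\s+<p idea).
JOIN = re.compile(r'</p>\s+<p(?:\s[^>]*)?>([a-z])')

def preview_joins(html, context=20):
    """Return (text before the break, text after the break) for each match."""
    snippets = []
    for m in JOIN.finditer(html):
        before = html[max(0, m.start() - context):m.start()]
        # m.end() - 1 is the captured letter, i.e. the first character kept.
        after = html[m.end() - 1:m.end() - 1 + context]
        snippets.append((before, after))
    return snippets

sample = (
    '<p class="calibre1">on the love seat </p>\n'
    '  <p class="calibre1">and accepted a glass of wine.</p>\n'
    '<p>Mr.</p> <p>Jones arrived.</p>'
)

snips = preview_joins(sample)
for before, after in snips:
    print(repr(before), '->', repr(after))
```

In this sample only the "love seat / and accepted" break is listed. The Mr./Jones break starts with a capital letter, so this pass leaves it alone and it still has to be handled by hand.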
09-06-2014, 12:39 PM | #5 |
Bookish
Posts: 910
Karma: 1803094
Join Date: Jun 2011
Device: PC, t1, t2, t3, aura 2 v1, clara HD, Libra 2, Nxtpaper 11
|
theducks: Sound advice.
Always test before using on your whole document. Do not forget to check the "case sensitive" check-box. |
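To see why the check-box matters, here is a small Python sketch. It assumes that leaving "case sensitive" unchecked behaves roughly like Python's `re.IGNORECASE` flag; the sample string is made up for illustration.

```python
import re

# Made-up sample: one break mid-sentence, one real paragraph break.
html = '<p>seat </p> <p>and wine.</p> <p>He left.</p>'
pattern = r'</p> <p>([a-z])'

# Box unchecked (assumed equivalent): [a-z] also matches "H",
# so the real paragraph break is merged as well.
case_insensitive = re.sub(pattern, r'\1', html, flags=re.IGNORECASE)

# Box checked: only the lowercase continuation is merged.
case_sensitive = re.sub(pattern, r'\1', html)

print(case_insensitive)
print(case_sensitive)
```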
|
09-06-2014, 12:44 PM | #6 |
Couch Potato
Posts: 5
Karma: 10
Join Date: Sep 2012
Device: Kindle Keyboard 3rd Gen.
|
Case sensitive. Why didn't I think of that? Thanks a ton.
|