#1
Junior Member
Posts: 2
Karma: 10
Join Date: Sep 2023
Device: iPhone
Using Regex to replace line breaks
I am editing and cleaning up a bunch of Calibre epub conversions and they're messy and full of trash code. Regex has been so helpful for finding stuff that regular find and replace can't handle efficiently.
I'm still pretty new to using regex and I was wondering if it can help me find the following: carriage returns / new lines that do not come at the end of a line that ends with a tag, so I can replace each of them with a single space. Here's an example of uncorrected text:
Code:
<p class="p2">‘Professor!’ It was Vesuvius. She sounded frightened. ‘Professor!’</p> <p class="p2">Sara looked to Robert.</p> <p class="p1"><br/>‘What is it?’ Sara asked. Robert put his arm around her, but she barely seemed to notice.</p>
Code:
<p class="p2">‘Professor!’ It was Vesuvius. She sounded frightened. ‘Professor!’</p> <p class="p2">Sara looked to Robert.</p> <p class="p1"><br/>‘What is it?’ Sara asked. Robert put his arm around her, but she barely seemed to notice.</p>
#2
Sigil Developer
Posts: 7,156
Karma: 4567890
Join Date: Nov 2009
Device: many
Start with a copy of your epub and try running Sigil's Mend and Prettify tool.
It may do what you want. If not, regex can.
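For the regex route, here is a minimal sketch in Python of the kind of search the first post asks for. It is not a pattern taken from this thread. It assumes that a "real" line break always comes directly after a `>` and that every other line break is a stray one that should become a single space. The names `STRAY_BREAK` and `join_stray_breaks` are made up for illustration. The same lookbehind idea should carry over to Sigil's Find & Replace in regex mode, but test it on a copy first.

```python
import re

# Assumption: a line break that directly follows '>' is intentional;
# any other line break is stray and gets replaced by one space.
# The '\r' inside the lookbehind stops the pattern from matching the
# bare '\n' of a Windows-style '>\r\n' line ending.
STRAY_BREAK = re.compile(r"(?<![>\r])\r?\n[ \t]*")


def join_stray_breaks(text):
    """Replace line breaks that do not follow a tag with a single space."""
    return STRAY_BREAK.sub(" ", text)


sample = (
    '<p class="p2">Sara looked\n'
    'to Robert.</p>\n'
    '<p class="p1">‘What is it?’\r\n'
    'Sara asked.</p>\r\n'
)
print(join_stray_breaks(sample))
```

Lines inside a `<style>` block usually do not end in `>`, so this sketch would join them too. It is safest to run it on the body text only.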
#3
Well trained by Cats
Posts: 29,101
Karma: 53103620
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
That is NOT a Calibre conversion (or artifact).
<p><br />... Gimme a proper scene break or first paragraph.
I don't even try for a one-click repair. After EACH success, I save, in case a later fix goes off the rails.
Once that is done, I fix the lowercase-lowercase ones. (This is the only one where I use Replace All, after a few tests.) I replace the class= with the actual one used in the book.
The rest I step through with Replace-Next or Find (which skips that one). These are fairly few and take only a few minutes.
Then I fix honorifics (many have a period).
Next is initials (I can't remember if I do them before or after honorifics).
Then I fix uppercase (e.g. names).
Upper-upper is mostly for acronyms.
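The regexes from this post did not survive in the text above, so what follows is only a guess at what the lowercase-lowercase pass might look like, sketched in Python. It assumes the problem is a sentence split across two `<p>` elements, with a lowercase letter on each side of the split, and that the paragraph class is passed in by hand (the "replace the class=" step). The function name `join_split_paragraphs` and its `css_class` parameter are hypothetical.

```python
import re


def join_split_paragraphs(html, css_class):
    """Merge paragraphs split mid-sentence (lowercase on both sides).

    Returns a (new_text, number_of_joins) tuple so the count can be
    checked before saving.
    """
    # Assumed shape of a bad split: a</p> <p class="...">b
    pattern = re.compile(
        r'([a-z])</p>\s*<p class="' + re.escape(css_class) + r'">([a-z])'
    )
    return pattern.subn(r"\1 \2", html)


html = (
    '<p class="p2">She looked</p>\n'
    '<p class="p2">at him.</p>\n'
    '<p class="p2">Then left.</p>'
)
fixed, joins = join_split_paragraphs(html, "p2")
print(joins)
print(fixed)
```

Returning the count is in the spirit of "save after each success": a surprisingly large number of joins is a hint to step through the matches one at a time instead of replacing them all.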
#4
Junior Member
Posts: 2
Karma: 10
Join Date: Sep 2023
Device: iPhone
Thanks for those regexes - they'll come in handy in my future editing!