12-05-2013, 12:43 PM | #1 |
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
|
Closing unwanted line breaks
Hello,
I've looked in the other threads about fixing line breaks using Sigil but can't find the solution to my problem: I'm trying to fix an ebook (a hated Calibre conversion from PDF, I think) in which almost every sentence that contains an em dash, though not all of them, is broken after the dash. It looks like this in code view:
blah blah blah —</p> <p class="calibre2">blah blah blah
In some instances, the code looks like this:
blah blah blah —</span></p> <p class="calibre2">blah blah blah
I've tried a lot of the regex search strings suggested in other threads, but none of them finds any matches, whether I click Find or Count All. Any ideas why the search strings aren't finding any matches? I would greatly appreciate any help! Thanks! |
12-05-2013, 01:19 PM | #2 | |
Well trained by Cats
Posts: 29,691
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Was the scope set to All HTML? Sigil comes with a few saved searches; Join is one of them, but only for simple joins. Both of the cases you show need care (Replace All is NOT a good idea). In the first case, the break may be desired (my regex for THAT ONE only allows the join IF the first letter being joined to is lower case). In your second: what to do about the </span>(s)? Things can get uglier, not better, FAST with a Replace All. |
|
12-05-2013, 01:38 PM | #3 | |
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
|
Quote:
I'm totally puzzled why Sigil isn't finding any matches for my search strings. Could you please post your REGEX string for the first fix? Thanks! |
|
12-05-2013, 02:07 PM | #4 | |
Grand Sorcerer
Posts: 27,468
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
However, if I had to guess, I'd say that whatever search expression you're using isn't taking into account the line breaks/white-space between the two lines of code you're trying to match. Given that the example you gave is consistent throughout the code, an expression to find what you're looking for (the one without the </span> involved) could be as simple as the one below (make sure you're actually typing the em dash character here).
Code:
—</p>\s+<p class="calibre2">
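For anyone who wants to try the expression outside Sigil first, here is a minimal Python sketch. The pattern is the one from the post; the replacement (keeping just the em dash) and the function name `join_after_dash` are assumptions for illustration, since the thread does not give a replace string for this case. The point it shows is that `\s+` is what lets the search span the newline and indentation between the two tags.

```python
import re

# Pattern from the post: an em dash, the closing </p>, any run of
# white-space (spaces, tabs, newlines), then the next paragraph's opening tag.
JOIN_AFTER_DASH = re.compile(r'—</p>\s+<p class="calibre2">')


def join_after_dash(html: str) -> str:
    """Close up the paragraph break that follows an em dash.

    Assumption: the replacement keeps only the em dash, so the two
    halves of the sentence end up in one paragraph.
    """
    return JOIN_AFTER_DASH.sub("—", html)


sample = 'blah blah blah —</p>\n\n  <p class="calibre2">blah blah blah</p>'
print(join_after_dash(sample))
```

A pattern with a single literal space between the tags would find nothing in this sample, which is probably the same symptom as the searches that returned no matches.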
|
12-05-2013, 02:55 PM | #5 | |
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
|
Quote:
Find: ([a-z])</p>\s+<p class="calibre2">
Replace: \1
In the replace expression, it is \1 followed by a single space. |
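The same find/replace pair can be sketched in Python to see what the capture group does. The function name `join_lowercase` and the sample text are made up for illustration; the pattern and the replacement (`\1` followed by one space) are the ones quoted above.

```python
import re

# ([a-z]) captures the last character before </p>, and only if it is a
# lower-case letter, so paragraphs ending in punctuation are left alone.
LOWERCASE_BREAK = re.compile(r'([a-z])</p>\s+<p class="calibre2">')


def join_lowercase(html: str) -> str:
    """Join paragraphs that break right after a lower-case letter."""
    # r'\1 ' puts the captured letter back, followed by a single space.
    return LOWERCASE_BREAK.sub(r'\1 ', html)


sample = (
    '<p class="calibre2">the quick brown</p>\n'
    '<p class="calibre2">fox jumped.</p>\n'
    '<p class="calibre2">New paragraph.</p>'
)
print(join_lowercase(sample))
```

Only the first break is closed up here; the paragraph ending in a full stop stays separate, which is the "needs care" behaviour mentioned earlier in the thread.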
|
12-05-2013, 03:51 PM | #6 | |
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
|
Quote:
Does the Sigil manual have a section on how to write code for regex? I looked in the manual trying to solve this problem but it kinda made my head spin. Now, about those other code items with </span>? |
|
12-05-2013, 04:36 PM | #7 | ||
Grand Sorcerer
Posts: 27,468
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Quote:
Code:
—</span></p>\s+<p class="calibre2">
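A rough Python sketch of the </span> case follows. The replacement string is an assumption, not something given in the thread: it keeps the em dash and the closing </span> so the span opened earlier in the paragraph stays balanced, and drops only the paragraph break. The `italic` class in the sample is hypothetical.

```python
import re

# Pattern from the post: the em dash sits inside a span that closes just
# before the paragraph ends.
SPAN_BREAK = re.compile(r'—</span></p>\s+<p class="calibre2">')


def join_after_span(html: str) -> str:
    """Close up the break but keep </span> so the tags stay balanced."""
    # Assumed replacement: em dash plus the closing span tag.
    return SPAN_BREAK.sub("—</span>", html)


sample = (
    '<p class="calibre2"><span class="italic">blah blah —</span></p>\n'
    '<p class="calibre2">blah blah</p>'
)
print(join_after_span(sample))
```

Dropping the </span> along with the </p> would leave an unclosed span in the joined paragraph, which is one way a Replace All can make things worse.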
||
12-06-2013, 08:04 AM | #8 | |
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
|
Quote:
|
|
12-06-2013, 08:20 AM | #9 |
Grand Sorcerer
Posts: 27,468
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
No problem. Good luck on your regex journey!
|