02-24-2018, 03:11 AM   #2
sjfan
Quote:
Originally Posted by nopi1001
In several of my books, there are points where the dialogue is broken by newlines. Below is an HTML example from an AZW3-formatted book of the thing I'd like to fix:

<p class="calibre1">"Let's just assume that you really do read minds. </p>
<p class="calibre1">What on earth makes you believe that I can do the same? </p>
<p class="calibre1">I think I would have a much easier time at work if I could read my clients' thoughts." </p>

I would like to change it to something like:

<p class="calibre1">"Let's just assume that you really do read minds. What on earth makes you believe that I can do the same? I think I would have a much easier time at work if I could read my clients' thoughts." </p>

Is there any way to find all these broken dialogues within a file with regex/regex-functions and make them appear whole and uninterrupted? I just started using the whole regex thing and haven't been able to find a way to fix problems like this.
1. Smarten the punctuation before you try this; it makes the matching a lot easier, since “ and ” are used in the main text while " straight quotes are used inside tags.
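2. Match across the paragraph break. The (?m) flag only changes how ^ and $ behave, so it's harmless here but not what does the work: a negated class like [^”] already matches newlines. Note the * after the class (so it can span the whole sentence) and \s* (so any line ending or indentation between the tags is covered); e.g. search for something like:
(?m)(“[^”]*)</p>\s*<p class="calibre1">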
and replace with:
\1

Run that replace a few times until it doesn't match any more.
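
If you'd rather test the idea outside the editor first, here's a rough Python sketch of the same repeat-until-no-match loop. It assumes the punctuation is already smartened, that the class is always calibre1 as in the sample, and that the pattern has a * after the character class plus \s* between the tags; the function name join_broken_dialogue is made up for illustration.

```python
import re

# Assumed pattern: an opening curly quote, any run of characters that are not a
# closing curly quote, then a paragraph break. "calibre1" comes from the sample.
BREAK_INSIDE_QUOTE = re.compile(r'(“[^”]*)</p>\s*<p class="calibre1">')

def join_broken_dialogue(html: str) -> str:
    """Repeat the substitution until no paragraph break is left inside a quote."""
    while True:
        # Each pass removes at most one break per open quote, hence the loop.
        html, count = BREAK_INSIDE_QUOTE.subn(r'\1', html)
        if count == 0:
            return html

sample = (
    '<p class="calibre1">“Let’s just assume that you really do read minds. </p>\n'
    '<p class="calibre1">What on earth makes you believe that I can do the same? </p>\n'
    '<p class="calibre1">I think I would have a much easier time at work '
    'if I could read my clients’ thoughts.” </p>\n'
)
print(join_broken_dialogue(sample))
```

The loop is just the "run it a few times" step automated: because the match is greedy, each pass removes the last break before the closing quote, so a three-paragraph quote takes two passes.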

But you probably want to do it by hand (examine each case) unless you're 100% confident all your quotation marks nest properly within each paragraph and the book isn't intentionally using continuing-quotes or the like.
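
One way to narrow down what needs a look by hand is to list the paragraphs whose opening and closing curly quotes don't balance. This is only a sketch under the same assumptions (smartened punctuation, simple un-nested p tags); unbalanced_paragraphs is a hypothetical helper, not anything built into calibre.

```python
import re

# Naive paragraph matcher: assumes <p> elements are not nested.
PARAGRAPH = re.compile(r'<p\b[^>]*>(.*?)</p>', re.DOTALL)

def unbalanced_paragraphs(html: str) -> list[str]:
    """Return the text of each paragraph whose “ and ” counts differ."""
    return [
        text.strip()
        for text in PARAGRAPH.findall(html)
        if text.count('“') != text.count('”')
    ]
```

Anything it returns is either a broken quote worth joining or an intentional continuing quote, which is exactly the distinction you'd want to make by eye.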