02-24-2018, 02:45 AM | #1 |
Junior Member
Posts: 4
Karma: 48
Join Date: Feb 2018
Location: SC
Device: kindle paperwhite
|
Fixing breaks in dialogue
In several of my books, there are points where the dialogue is broken with newlines. Below is an HTML code example from an AZW3 formatted book of the thing I'd like to fix:
<p class="calibre1">"Let's just assume that you really do read minds. </p>
<p class="calibre1">What on earth makes you believe that I can do the same? </p>
<p class="calibre1">I think I would have a much easier time at work if I could read my clients' thoughts." </p>
I would like to change it to something like:
<p class="calibre1">"Let's just assume that you really do read minds. What on earth makes you believe that I can do the same? I think I would have a much easier time at work if I could read my clients' thoughts." </p>
Is there any way to find all these broken dialogues within a file with regex/regex-functions and make them appear whole and uninterrupted? I just started using the whole regex thing and haven't been able to find a way to fix problems like this. |
02-24-2018, 03:11 AM | #2 | |
Addict
Posts: 281
Karma: 7724454
Join Date: Sep 2017
Location: Bethesda, MD, USA
Device: Kobo Aura H20, Kobo Clara HD
|
Quote:
2. Use (?m) for multi-line matches; e.g. search for something like:
(?m)(“[^”])</p>[\n]<p class="calibre1">
and replace with:
\1
Run that replace a few times until it doesn't match any more. But you probably want to do it by hand (examine each case) unless you're 100% confident all your quotation marks nest properly within each paragraph and the book isn't intentionally using continuing-quotes or the like. |
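For anyone who would rather script the "run it until it stops matching" step, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not the expression as posted: the `[^”<]*` (with a `*`, and excluding `<`) and the `\s*` between the tags are my own choices, `join_broken_dialogue` is a made-up name, and it presumes curly quotes, the `calibre1` class from the example, and no inline tags inside the dialogue.

```python
import re

# Assumed pattern: an opening curly quote, then any run of characters that
# is neither a closing curly quote nor "<", then a paragraph break.
BROKEN = re.compile(r'(“[^”<]*)</p>\s*<p class="calibre1">')


def join_broken_dialogue(html):
    """Repeat the replace until a pass makes no substitutions."""
    while True:
        html, count = BROKEN.subn(r"\1", html)
        if count == 0:
            return html


sample = (
    '<p class="calibre1">“Let\'s just assume that you really do read minds. </p> '
    '<p class="calibre1">What on earth makes you believe that I can do the same? </p> '
    '<p class="calibre1">I think I would have a much easier time at work '
    'if I could read my clients\' thoughts.” </p>'
)
print(join_broken_dialogue(sample))
```

Each pass joins only one break per quote, which is why the loop is needed for dialogue split across three or more paragraphs. The caution about checking each case by hand still applies; the loop only removes the clicking.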
|
02-24-2018, 05:41 AM | #3 | |
Junior Member
Posts: 4
Karma: 48
Join Date: Feb 2018
Location: SC
Device: kindle paperwhite
|
Quote:
Step one was a great idea that I'll have to remember, and the comment on continuing-quotes saved me a lot of work as well. Although the expression you supplied did not work for me (probably something I was doing wrong/don't understand yet), the logic behind it allowed me to put together something that worked with my initial example and the rest of the book I was working on (and I'll be able to edit it for use in other books as well)!
If anyone is curious about what I cobbled together:
\“.*\”(*SKIP)(*FAIL)|(?<thing>\“.*)</p>\s*<p class="calibre1">
Another question did come up though. In the replace field you said to put a \1. This worked beautifully, but if you could explain why/what exactly this (expression?) is doing, I would greatly appreciate it, as this was one of the things that was tripping me up (being able to dynamically copy & paste part of something found with desired changes). This has been stumping me for a while now, so thanks again! Now I can add another means of efficiently editing books to my slowly growing repertoire!
Update: Just edited the above expression to account for continuing-quotes:
\“.*\”(*SKIP)(*FAIL)|\“.*</p>\s*<p class="calibre1">\“(*SKIP)(*FAIL)|(?<thing>\“.*)</p>\s*<p class="calibre1">
Last edited by nopi1001; 02-24-2018 at 05:51 AM. Reason: updating |
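The logic of the expression above (match the things you want left alone first, then capture what is left) can also be sketched in plain Python. Python's standard `re` module does not support the `(*SKIP)(*FAIL)` verbs, so this sketch swaps in a different technique: an alternation plus a replacement function that puts "skipped" matches back unchanged. The pattern is my own simplification, not a translation of the one posted; the `(?!“)` lookahead stands in for the continuing-quotes branch, and `fix` is a hypothetical helper name.

```python
import re

# Alternative 1: a complete quote inside one paragraph (to be left alone).
# Alternative 2: an opening quote that reaches </p> before any closing
#                quote; only this branch sets the named group "thing".
# The (?!“) lookahead refuses to join when the next paragraph itself opens
# with a quote, which is how continuing-quotes are usually typeset.
PATTERN = re.compile(
    r'“[^”<]*”'
    r'|(?P<thing>“[^”<]*)</p>\s*<p class="calibre1">(?!“)'
)


def fix(match):
    # No "thing" means alternative 1 matched: return the text unchanged.
    if match.group("thing") is None:
        return match.group(0)
    # Otherwise keep the captured text and drop the paragraph break.
    return match.group("thing")


text = (
    '<p class="calibre1">“A whole quote,” she said.</p>\n'
    '<p class="calibre1">“A broken </p>\n'
    '<p class="calibre1">quote.”</p>\n'
    '<p class="calibre1">“A continuing quote.</p>\n'
    '<p class="calibre1">“Its second paragraph.”</p>'
)
fixed = PATTERN.sub(fix, text)
print(fixed)
```

Because `[^”<]*` stops at any tag, dialogue containing inline markup such as `<i>` is left untouched by this sketch and would still need a manual pass.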
|
02-24-2018, 05:58 AM | #4 |
null operator (he/him)
Posts: 20,565
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
If you can wrangle your books into Word 2007/10/13/16 you could use the Dialogue Checker in Toxaris' excellent e-Book Tools - a Word add-in.
The add-in itself can import and export EPUB. Or you could convert to DOCX and then convert the modified DOCX back to EPUB using one of several DOCX->EPUB tools. BR |
02-24-2018, 06:50 AM | #5 | |
Resident Curmudgeon
Posts: 73,957
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
Last edited by DoctorOhh; 02-24-2018 at 07:35 AM. Reason: fixed quote closing tag |
|
02-24-2018, 10:36 AM | #6 | |
Book E d i t o r
Posts: 432
Karma: 288184
Join Date: May 2015
Device: Laptop
|
Quote:
I have also been using ctrl-n often to see how many matches there are. Thank you to the poster (in another thread) who posted that! |
|
02-24-2018, 12:07 PM | #7 |
Resident Curmudgeon
Posts: 73,957
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Please don't help until the OP comes back and can prove that the eBook was bought. I took a look at a sample from Amazon and a sample from Kobo and both had no problem. So I don't think this is a legit copy the OP is talking about.
|
02-24-2018, 02:14 PM | #8 | |
Addict
Posts: 281
Karma: 7724454
Join Date: Sep 2017
Location: Bethesda, MD, USA
Device: Kobo Aura H20, Kobo Clara HD
|
Quote:
Each pair of parentheses in the search expression captures whatever text it matched; \1 in the replacement stands for the text captured by the first pair, \2 for the second, and so on.
Consider the sentence “I like Saturday and Sunday, but Samuel prefers sandwiches.”
Search for: .*(Sat[^ ]*).*(san.*)[.]
Replace with: \2 was last, \1 was first
Results in: “sandwiches was last, Saturday was first”
That's a relatively simple case; there are a lot of other possibilities.
http://www.pcre.org/current/doc/html...ern.html#SEC19
http://www.pcre.org/current/doc/html...ern.html#SEC14 |
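The same substitution can be tried outside the editor. This is a small Python sketch using the standard `re` module, with the group parentheses written unescaped so that both groups capture, and with the sentence stored without its surrounding quotation marks (my simplification). The last lines show a named group, which Python's `re` spells `(?P<name>...)` and refers back to as `\g<name>`.

```python
import re

sentence = "I like Saturday and Sunday, but Samuel prefers sandwiches."
pattern = r".*(Sat[^ ]*).*(san.*)[.]"

# Inspect what each pair of parentheses captured.
match = re.search(pattern, sentence)
print(match.group(1))  # text captured by the first pair
print(match.group(2))  # text captured by the second pair

# \1 and \2 in the replacement insert those captured pieces.
result = re.sub(pattern, r"\2 was last, \1 was first", sentence)
print(result)

# Named groups work the same way, referenced by name instead of number.
named = re.sub(r"(?P<day>Sat[^ ]*)", r"[\g<day>]", sentence)
print(named)
```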
|
02-24-2018, 02:33 PM | #9 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I used to use a lot of hackish Regex, but it would always miss hard cases, especially cases of inner/outer quotes. If you still want to use Regex though, as sjfan mentioned, the most important step is to first smarten the punctuation. There's no reliable way to fix missing quotation marks with dumb quotes.
For example, this is one I used to use:
Search: (“[^”\r\n]*)</p>\s+<p>
Replace: \1 
(There is a space after that "\1 " in Replace.)
That Regex would look for a LEFT double quote in a paragraph without a RIGHT closing quote.
Anyway, there was a lot of quotation mark discussion in previous topics which might also help you:
https://www.mobileread.com/forums/sh...d.php?t=292818
https://www.mobileread.com/forums/sh...d.php?t=212029
Quote:
The topic is about fixing breaks in dialogue, which is a very common error across all types of ebooks. |
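To see why smartening the punctuation has to come first, the search expression with the LEFT/RIGHT double quotes can be used to list candidates for review instead of replacing them. This is a rough Python sketch; `list_candidates` and the sample text are made up for illustration, and the sample uses bare `<p>` tags as that expression does.

```python
import re

# A left double quote with no right double quote (and no line break)
# before the paragraph ends and the next paragraph starts.
CANDIDATE = re.compile(r'(“[^”\r\n]*)</p>\s+<p>')


def list_candidates(html):
    """Return the unclosed quote fragments so they can be checked by hand."""
    return [m.group(1) for m in CANDIDATE.finditer(html)]


smart = '<p>“Wait for me. </p>\n<p>I am coming.”</p>\n<p>“Fine,” he said.</p>'
# The same text with straight ("dumb") quotes instead of curly ones.
dumb = smart.replace('“', '"').replace('”', '"')

print(list_candidates(smart))  # one unclosed quote is reported
print(list_candidates(dumb))   # nothing is reported
```

With straight quotes the opening and closing marks are the same character, so the expression has nothing to anchor on and the broken dialogue goes unreported.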
||
02-24-2018, 03:05 PM | #10 |
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
I really don't understand why people use PDF files and don't say so.
And to go further with this, what did it really help to use a regexp in this case? The only effect is that the user now has an endless flow of words. That is not a really helpful tip either. (I know he asked for it.) The first part of JSWolf's request wasn't stupid if we think about it for a second... Maybe it would be more efficient to wait for the answer and then give a better solution (e.g. how to do a better conversion of a PDF for this little special problem...) Last edited by Divingduck; 02-24-2018 at 03:07 PM. |
02-24-2018, 03:21 PM | #11 |
Addict
Posts: 281
Karma: 7724454
Join Date: Sep 2017
Location: Bethesda, MD, USA
Device: Kobo Aura H20, Kobo Clara HD
|
Huh? The regex solutions outlined only join the current quote, they don't turn everything into a continuous paragraph.
|
02-24-2018, 04:08 PM | #12 | |
Junior Member
Posts: 4
Karma: 48
Join Date: Feb 2018
Location: SC
Device: kindle paperwhite
|
Quote:
I appreciate everyone's suggestions and am thrilled with all I have learned from this post! Both new sources and new logic to use. |
|
02-24-2018, 04:27 PM | #13 |
Resident Curmudgeon
Posts: 73,957
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
If you are converting from PDF, then the solution to that is don't do it. It's not worth the hassle.
The problem is that if you have split dialog, you have other split lines. The only way to sort of fix some of it is to check for lack of punctuation at the end of the line and combine that line with the next one. But the only way to really fix it is to find a good PDF or pBook source and do a side-by-side (s/b) compare. That's how it will be fixed. Last edited by JSWolf; 02-24-2018 at 04:30 PM. |
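The "no punctuation at the end of the line" check can be sketched in a few lines of Python. This is only an illustration of the heuristic: the function names, the choice of terminator characters, and the sample paragraphs are all assumptions, and it works on a list of already extracted paragraph texts instead of on HTML.

```python
# Characters treated as ending a sentence, and closers that may follow them.
TERMINATORS = ('.', '!', '?', '…', ':')
CLOSERS = '”’"\')]'


def ends_sentence(text):
    """True if the text ends in terminal punctuation, ignoring closers."""
    stripped = text.rstrip().rstrip(CLOSERS)
    return stripped.endswith(TERMINATORS)


def merge_split_lines(paragraphs):
    """Join each paragraph onto the previous one when that one looks cut off."""
    merged = []
    for para in paragraphs:
        if merged and not ends_sentence(merged[-1]):
            merged[-1] = merged[-1].rstrip() + ' ' + para.lstrip()
        else:
            merged.append(para)
    return merged


lines = [
    'The rain had stopped by the time',
    'we reached the harbour.',
    '“Are you coming?” she asked.',
    'He nodded and picked up',
]
merged = merge_split_lines(lines)
print(merged)
```

As the post says, this only sort of fixes some of it: headings without punctuation get merged by mistake, and dialogue that was split right after a full stop is not caught at all.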
02-24-2018, 04:55 PM | #14 |
Junior Member
Posts: 4
Karma: 48
Join Date: Feb 2018
Location: SC
Device: kindle paperwhite
|
...I explicitly stated in previous posts on this thread that my problem has been solved by sjfan (his solution/logic worked for my problem and with it I have already made headway in other projects) and was further helped along with his later advice and the advice from others. I will continue to convert from PDF and do not appreciate your unhelpful comments.
|
02-24-2018, 05:01 PM | #15 | ||
Addict
Posts: 281
Karma: 7724454
Join Date: Sep 2017
Location: Bethesda, MD, USA
Device: Kobo Aura H20, Kobo Clara HD
|
Quote:
It’s certainly true that you need to do a manual check of things to get everything right. But it’s still worth automating what you can. There are certainly cases like: Quote:
But you still want to automate what you can. It saves a lot of work and reduces error rates. If you take care of 90% of the cases mechanically, you’re mentally freer as you're reading through and proofing things. It allows you to focus your energy on the cases that really need some thought. And it reduces the chances that you’ll miss something. |
||
|