Old 07-20-2011, 07:46 PM   #13
theducks
Well trained by Cats
Posts: 29,782
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by scubaddictions
Ok, I understand that part. The problem I'm having though doesn't deal with double line breaks, I'm not trying to remove blank lines or put paragraphs back together. The problem texts I'm dealing with only have single line breaks. Some of them are original and required, some are not from the original text and stuck into the middle of a sentence. For example:

This is not a broken sentence. This sentence, however
is broken in the middle. I'd like to fix it if at all possible.
"Do you want broken sentences?" Bill asked.
Jimmy replied "No, I do not".

This paragraph has only single line breaks, four of them. Three of them are as the author intended, breaking up text onto different lines so it doesn't run together. One of them (after the word "however") is not what the author wanted; it was added during some later editing to fit the borders of some other format.

The Find/Replace searches from my first post fix this by finding lines that either begin with or end with a lower case letter. Seems to work near perfectly. I can't figure out any other way to Find/Replace only the unintended single line breaks.

Ideas? Thanks!
I fix these in a text editor or Sigil (code view), with a number of passes to carefully catch most of them.

Case sensitive mode: set

First pass: replace where a line ends in a-z or a comma and the next line starts with a-z (Replace All is fairly safe after testing):
([a-z,])</p>\s+<p class=.+>([a-z])

Replace: \1 \2
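
If you would rather script this pass than click through it, here is a minimal Python sketch of the same Find/Replace. Two things are my assumptions, not part of the recipe above: the function name join_lowercase_breaks is made up, and I use [^>]+ where the pattern above has .+ so the match cannot run past the end of the <p> tag. Python's re module is case sensitive by default, which matches the "Case sensitive mode: set" requirement.

```python
import re

# First pass: a line ending in a-z or a comma, followed by a paragraph
# that starts with a-z. [^>]+ (assumption) replaces .+ from the post so
# the match stays inside the opening <p ...> tag.
FIRST_PASS = re.compile(r'([a-z,])</p>\s+<p class=[^>]+>([a-z])')


def join_lowercase_breaks(html: str) -> str:
    """Join the two paragraphs with a single space (Replace: \\1 \\2)."""
    return FIRST_PASS.sub(r'\1 \2', html)


sample = (
    '<p class="calibre1">This sentence, however</p>\n'
    '<p class="calibre1">is broken in the middle.</p>\n'
    '<p class="calibre1">"Do you want broken sentences?" Bill asked.</p>'
)
print(join_lowercase_breaks(sample))
```

Only the break after "however" is joined; the paragraph that ends with a full stop is left alone. Test it on a copy of the file first.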

Next pass: I pick up a-z followed by a closing quote.
([a-z]\")</p>\s+<p class=.+>([a-z])
Replace: \1 \2
Replace All is fairly safe here too.
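
The quote pass can be sketched the same way in Python. The function name join_after_closing_quote and its quote parameter are hypothetical; the parameter is there so you can pass whichever quote character your book uses (straight or closing curly). As an assumption on my part, [^>]+ stands in for .+ so the match stays inside the <p> tag.

```python
import re


def join_after_closing_quote(html: str, quote: str = '"') -> str:
    """Join paragraphs where the first ends in a-z plus a closing quote
    and the second starts with a-z. `quote` is the closing quote
    character used in the book (assumption: exactly one kind per run)."""
    pattern = re.compile(
        r'([a-z]%s)</p>\s+<p class=[^>]+>([a-z])' % re.escape(quote)
    )
    return pattern.sub(r'\1 \2', html)


sample = (
    '<p class="c">He muttered "not again"</p>\n'
    '<p class="c">and walked off.</p>'
)
print(join_after_closing_quote(sample))
```

For a book with curly quotes, call it as join_after_closing_quote(text, quote='”') instead.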

Now it gets iffy. I suggest using Find and then choosing Replace or Skip for each hit, rather than Replace All.

We are going to repeat the above, BUT with the next part beginning with a capital letter:
... he looked at</p>
<p>James and winked...

([a-z,])</p>\s+<p class=.+>([A-Z])

And now with the quote (be sure to use the quote type, straight or closing curly, that is actually used in your book):

([a-z]\")</p>\s+<p class=.+>([A-Z])
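
For anyone scripting the capital-letter pass, a rough Python stand-in for "Replace or Skip" is to ask a decision function about each hit instead of replacing blindly. Everything named here is hypothetical: replace_or_skip, the decide callback, and the tweak of capturing the whole next word (so the callback has something to judge) are my assumptions, as is using [^>]+ in place of .+.

```python
import re
from typing import Callable

# Same idea as the capital-letter pattern above, but group 2 captures the
# whole next word (assumption) so the reviewer can see what follows.
CAPITAL_PASS = re.compile(r'([a-z,])</p>\s+<p class=[^>]+>([A-Z]\w*)')


def replace_or_skip(html: str, decide: Callable[[str], bool]) -> str:
    """Join a break only when decide(next_word) returns True;
    otherwise leave the matched text exactly as it was."""
    def _sub(m: re.Match) -> str:
        if decide(m.group(2)):
            return f'{m.group(1)} {m.group(2)}'
        return m.group(0)
    return CAPITAL_PASS.sub(_sub, html)


sample = (
    '<p class="c">he looked at</p>\n'
    '<p class="c">James and winked.</p>\n'
    '<p class="c">She laughed,</p>\n'
    '<p class="c">Then the scene ended.</p>'
)
# Stand-in for a human reviewer: only join before a known proper name.
print(replace_or_skip(sample, lambda word: word in {"James"}))
```

In a real session, decide could print the hit and call input() so you approve each one by eye, which is closer to what Replace or Skip does in the editor.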


There may be a few odd ones that you will have to deal with by hand, such as a line that ends with an abbreviation or initial:

Mr.</p>
<p>Jones
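
To track down those leftovers, one option is to list them rather than replace them. This is only a sketch: the abbreviation list and the name find_abbreviation_breaks are assumptions of mine, and the list will need extending for your book. It reports the line number of each hit so you can go and fix it by hand.

```python
import re

# Assumption: a short, illustrative list of titles; extend as needed.
ABBREVIATIONS = ("Mr", "Mrs", "Ms", "Dr")
ABBR_BREAK = re.compile(
    r'\b(?:%s)\.</p>\s+<p[^>]*>[A-Z]' % '|'.join(ABBREVIATIONS)
)


def find_abbreviation_breaks(html: str) -> list:
    """Return (line_number, matched_text) for each paragraph break that
    directly follows one of the listed abbreviations. Nothing is changed."""
    return [
        (html.count('\n', 0, m.start()) + 1, m.group(0))
        for m in ABBR_BREAK.finditer(html)
    ]


sample = (
    '<p>Then Mr.</p>\n'
    '<p>Jones arrived.</p>\n'
    '<p>The end.</p>\n'
    '<p>Really.</p>'
)
for line_no, text in find_abbreviation_breaks(sample):
    print(line_no, repr(text))
```

An ordinary sentence ending such as "The end." is not reported, only the breaks after the listed abbreviations.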