Fixing broken sentences.
Hi, all. I've found out through a few other threads how to fix the broken sentences left behind by PDF-to-ePub conversions. Currently, I'm using:
Find: ([a-z])</p>\s+<p class="calibre2">
Replace: \1_
(The _ being a space)
Is there a way to add something that skips over breaks where the first letter of the second line is a capital?
For example, I'd like to find this:
...blahblah</p>
<p class="calibre2">blahblah...
But not this:
...blahblah</p>
<p class="calibre2">Blahblah...
Basically, this would help me a lot while trying to fix things like scripts or screenplays, or books with multi-line chapter titles, such as:
CHAPTER 6: The Plot Thickens
Ottawa
Any help would be much appreciated. Thanks in advance.
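One possible approach, sketched here in Python rather than in the editor's own Find/Replace box: add a lookahead, (?=[a-z]), to the end of the Find pattern so that it only matches when the next paragraph starts with a lowercase letter. The lookahead checks the letter without consuming it, so the Replace string can stay as it is. The function name join_broken_sentences is made up for illustration, and this assumes matching is case-sensitive (Python's default; an editor may need its case-sensitive option switched on) and that the regex engine supports lookahead.

```python
import re

# Same pattern as in the post, plus a lookahead that requires the next
# paragraph to begin with a lowercase letter. The lookahead does not
# consume that letter, so it is left in place after the replacement.
JOIN_BREAK = re.compile(r'([a-z])</p>\s+<p class="calibre2">(?=[a-z])')


def join_broken_sentences(html: str) -> str:
    """Join paragraphs split mid-sentence (hypothetical helper).

    A break is joined only when the text before it ends in a lowercase
    letter and the text after it starts with one.
    """
    return JOIN_BREAK.sub(r'\1 ', html)


if __name__ == "__main__":
    broken = '...blahblah</p>\n<p class="calibre2">blahblah...'
    intact = '...blahblah</p>\n<p class="calibre2">Blahblah...'
    print(join_broken_sentences(broken))  # break is joined with a space
    print(join_broken_sentences(intact))  # left unchanged
```

In a Find/Replace dialog the equivalent would be Find: ([a-z])</p>\s+<p class="calibre2">(?=[a-z]) with the same Replace as before. Note that a sentence broken just before a capitalised word (a name, for instance) would also be skipped, so those would still need fixing by hand.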