Removing excess carriage returns
I have some old txt files that I'm trying to switch to ebooks.
Many of them have sentences broken by carriage returns. E.g. "The sentence is fine, and in most cases paragraphs are intact, but perhaps one in every five sentences contains a carriage return in the middle, which is mildly annoying when reading on my Cybook." The common factor is that there's no punctuation before the carriage return. Is there any way to sort this out? I was thinking perhaps if I could get Calibre to delete any carriage returns that were not preceded by .!? or ." !" ?" |
The 'common factor' is probably that these misplaced carriage returns are followed by lowercase letters (not necessarily every single time - but mostly).
If you have MSWord or similar, you could try doing a search, or search and replace, for ^13[a-z]. bob |
This bit of python code should work for what you want:
Code:
>>> f = open('test', 'rb+')
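The snippet above is cut off in the thread, so here is a minimal sketch of one way to do what was asked, not necessarily what was originally posted. It assumes a line break is genuine only when the line ends in . ! ? or a closing quote, and that blank lines between paragraphs should survive. The function name `join_broken_lines` is made up for illustration.

```python
import re

# A break counts as "misplaced" when the character before it is not
# sentence-ending punctuation, a closing quote, whitespace, or another
# newline, and the break is not followed by a blank line or end of text.
MISPLACED_BREAK = re.compile(r'(?<![.!?"\'\n \t])[ \t]*\n(?!\n|\Z)')


def join_broken_lines(text):
    """Replace mid-sentence line breaks with a single space."""
    # Normalise Windows (\r\n) and bare \r line endings to \n first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Substitute a space so the two words are not glued together.
    return MISPLACED_BREAK.sub(" ", text)


sample = (
    "The sentence is fine, and in most\r\n"
    "cases paragraphs are intact.\r\n"
    "The next paragraph starts here.\r\n"
)
print(join_broken_lines(sample))
```

To run it on a real file, read the text with `open(path, newline="")` so Python leaves the carriage returns alone, pass it through the function, and write the result to a new file rather than over the original.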
|
Don't forget to add a space! Or you'll be spell checking for days because you stuck two words together at each join.
It's easy to switch all occurrences of multiple spaces to one space, though, if you happen to double up. So first...
Find:
Code:
\r\n([a-z])
Replace:
Code:
\s$1
Find:
Code:
([a-z])\s+
Replace:
Code:
$1\s
Try it on a copy first. m a r |
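If you'd rather script it than use an editor, the same two passes can be sketched with Python's `re` module. A few assumptions: the replacement is written as a literal space (Python's `re.sub` does not treat `\s` as a space in a replacement string), the second pass is limited to spaces and tabs so the remaining line breaks are left alone, and `rejoin_and_tidy` is just an illustrative name.

```python
import re


def rejoin_and_tidy(text):
    """Join breaks that precede a lowercase letter, then collapse doubled spaces."""
    # Pass 1: a line break followed by a lowercase letter becomes
    # a space plus that letter, so the two words stay separate.
    text = re.sub(r'\r?\n([a-z])', r' \1', text)
    # Pass 2: any run of two or more spaces/tabs becomes one space,
    # in case pass 1 doubled up where a space already existed.
    text = re.sub(r'[ \t]{2,}', ' ', text)
    return text


sample = "two words \r\ntogether at each join.\r\nNew sentence here."
print(rejoin_and_tidy(sample))
```

As with the editor version, try it on a copy first.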
Thanks folks!
MobileRead.com is a privately owned, operated and funded community.