Hello all,
I'm trying to work within Calibre primarily but occasionally the formatting fails a bit and I try to drop back and fix the original text files manually. I'm working off some info from an older post on this forum and so this post is more about pre-Calibre editing.
Here's the thing: a number of the original texts I'm dealing with have hard-coded line breaks/carriage returns that cause broken-up sentences in the final product. I've gathered enough information to create a functioning solution, but I'm using demo software that will expire in a month. In brief, I'm trying to figure out which freeware text editor will let me use the same solution I've worked out in the expensive software that I don't want to purchase. I'm not truly cheap, I'm just certain that I can do this task without needing one particular piece of commercial software.
From this forum post here:
https://www.mobileread.com/forums/showthread.php?t=47044
I learned that it was possible to do a relatively simple Find/Replace in a text editor to search for a line break followed directly by any lower-case letter of the alphabet, as would usually happen if you place a line break mid-sentence. I was successful using this technique in the recommended text editor (UltraEdit), but of course it costs money. I have a multitude of other free text editors and I believe I should be able to perform the same task in one of them just the same. I have to admit that I only partially understand the syntax of the search parameters, which makes it difficult to translate it directly to another application.
First, what works: Open document in UltraEdit, pull up Replace window. Select Match Case and turn on Regular Expression, choose Perl as Expression Engine.
Find What: \r\n([a-z])
Replace With: \1 <---There is a space before the One. (Space - Backslash - One)
This grabs most instances. For various reasons (capital letters, punctuation) I found that running a second pass using the inverse manages to catch almost all of the other instances, like this:
Find What: ([a-z])\r\n
Replace with: \1 <---There is a space after the one. (Backslash - One - Space)
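For anyone who would rather script this than depend on one particular editor, the two passes above can be sketched in Python. This is only an illustration under my own assumptions, not something taken from the linked thread: the function name join_broken_lines and the sample text are made up, and it assumes the file uses Windows-style \r\n line breaks exactly as in the UltraEdit patterns.

```python
import re

def join_broken_lines(text: str) -> str:
    """Hypothetical helper mirroring the two UltraEdit passes described above."""
    # Pass 1: a line break followed by a lower-case letter
    # becomes a space followed by that same letter.
    text = re.sub(r"\r\n([a-z])", r" \1", text)
    # Pass 2: a lower-case letter followed by a line break
    # becomes that same letter followed by a space.
    text = re.sub(r"([a-z])\r\n", r"\1 ", text)
    return text

# Made-up sample: two mid-sentence breaks, then a real paragraph break.
sample = (
    "This sentence was\r\n"
    "broken mid-way and\r\n"
    "Then continues.\r\n"
    "\r\n"
    "New paragraph."
)

print(repr(join_broken_lines(sample)))
# 'This sentence was broken mid-way and Then continues.\r\n\r\nNew paragraph.'
```

The break after a full stop is left alone because neither pattern matches it, which is what keeps paragraph breaks intact. If the text comes from a file, opening it with open(path, newline="") should keep the \r\n pairs from being translated before the patterns see them.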
So, the UltraEdit approach works like a charm, but the demo expiration (ver. 17.10.0.1010) will leave me stranded. The same author of the information above also recommended a different text editor, TextPad, which I downloaded (ver. 5.4.2). In addition, I have access to Notepad++ (ver. 5.9.2), OpenOffice (ver. 3.2.1), and Windows' WordPad and Notepad. With the possible exception of OpenOffice and the built-in Windows tools, the rest are all recent downloads and should be the newest available.
I've tried many different versions of this syntax in the other text editors available to me, with no real success. Part of the problem seems to be the different modes in which a text editor can interpret search parameters: as Normal text, as Extended characters, or as Regular Expression. Each has its own version of a line break (^13 or ^p, \r\n, and $), and I'm reading websites that reference all of those and more. None of the other text editors accept the exact syntax I've outlined above. Each one either erases characters that it shouldn't, pastes in characters that I don't want, or just leaves the extra line breaks intact. I think I've hit a brick wall and need help from people more experienced than I, and here I am. Can anybody help me?
Thanks!
Ryan