Old 07-19-2011, 07:36 AM   #1
scubaddictions
Member
scubaddictions began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2011
Device: Smartphone
Need Text editor for specific Find/Replace Regular Expression

Hello all,

Originally posted this in a different sub-forum and I think this one is more appropriate.

Here's the thing: a number of the original texts I'm dealing with have hard-coded line breaks/carriage returns that cause broken-up sentences in the final product. Sometimes Calibre deals with them correctly, sometimes not. I've gathered enough information to create a functioning solution, but I'm using demo software that will expire in a month. In brief, I'm trying to figure out which freeware text editor will let me use the same solution I've worked out in the expensive software that I don't want to purchase. I'm not truly cheap; I'm just certain that I can do this task without needing one particular commercial program.

From this forum post here:

https://www.mobileread.com/forums/showthread.php?t=47044

I learned that it was possible to do a relatively simple Find/Replace in a text editor to search for a line break followed directly by any lower-case letter of the alphabet, as would usually happen if you place a line break mid-sentence. I was successful using this technique in the recommended text editor (UltraEdit), but of course it costs money. I have a multitude of other free text editors and I believe I should be able to perform the same task in one of them just the same. I have to admit that I only partially understand the syntax of the search parameters, which makes it difficult to translate it directly to another application.

First, what works: Open document in UltraEdit, pull up Replace window. Select Match Case and turn on Regular Expression, choose Perl as Expression Engine.

Find What: \r\n([a-z])

Replace With:  \1 <--- There is a space before the \1. (Space - Backslash - One)
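
For anyone who wants to see what this first pass does outside of any particular editor, here is a minimal sketch in Python. Python is not one of the editors discussed in this thread, and the function name `join_before_lowercase` and the sample text are made up for illustration; only the pattern and replacement come from the post above.

```python
import re

def join_before_lowercase(text: str) -> str:
    """First pass: a CRLF followed by a lower-case letter becomes a space."""
    # ([a-z]) captures the letter after the line break so that \1 can put it
    # back; without the capture group the letter itself would be deleted.
    return re.sub(r"\r\n([a-z])", r" \1", text)

sample = "The quick brown\r\nfox jumps.\r\nNext sentence."
print(repr(join_before_lowercase(sample)))
# The break before "fox" is joined; the break before "Next" is left alone.
```

The capture group is the part that tends to get lost when porting: an editor that does not support `\1` in the replacement field will eat the first letter of the next line.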

This grabs most instances. For various reasons (capital letters, punctuation) I found that running a second pass using the inverse manages to catch almost all of the other instances, like this:

Find What: ([a-z])\r\n

Replace With: \1  <--- There is a space after the \1. (Backslash - One - Space)
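
The second pass can be sketched the same way. Again this is Python rather than any of the editors above, and `join_after_lowercase` and the sample sentence are hypothetical; the pattern and replacement are the ones just given.

```python
import re

def join_after_lowercase(text: str) -> str:
    """Second pass: a CRLF directly after a lower-case letter becomes a space."""
    # Here the captured letter comes before the line break, so the
    # replacement is the letter followed by a space.
    return re.sub(r"([a-z])\r\n", r"\1 ", text)

# The first pass would miss this break because the next line starts with
# a capital letter ("Paris").
sample = "She went to\r\nParis last year.\r\nThen she came home."
print(repr(join_after_lowercase(sample)))
```

One thing to watch for: this pattern also joins any line that simply ends in a lower-case letter, such as a heading with no closing punctuation, so it is worth checking the result rather than trusting it blindly.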

So, this works like a charm, but the demo expiration on UltraEdit (ver. 17.10.0.1010) will leave me stranded. The author of the information above also recommended a different text editor, TextPad, which I downloaded (ver. 5.4.2). In addition, I have access to Notepad++ (ver. 5.9.2), OpenOffice (ver. 3.2.1), along with Windows' WordPad and Notepad. With the possible exception of OpenOffice and the built-in Windows stuff, the rest are all recent downloads and should be the newest available.

I've tried so many different versions of this syntax in the other text editors I already have, with no real success. Part of the problem seems to be the different ways a text editor can interpret search parameters: as Normal Text, as Extended characters or as Regular Expression. Each has its own version of a line break (^13 or ^p, \r\n, $, etc.) and I'm reading websites that reference all of those and more. None of the other text editors accepts the exact syntax I've outlined above. Each one either erases characters that it shouldn't, pastes in characters that I don't want or just leaves the extra line breaks intact. I think I've hit a brick wall and need help from people more experienced than I, and here I am. Can anybody help me?

Thanks!

Ryan
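
On the different line-break notations mentioned above: part of the trouble may be that the file does not always contain the \r\n pair the pattern expects. A rough sketch in Python (not one of the editors in this thread) that runs both passes while accepting Windows (\r\n), Unix (\n) and old Mac (\r) line endings might look like this. The names `LINE_BREAK` and `unwrap` are made up for the example.

```python
import re

# Try the two-character Windows ending first so that a CRLF is consumed as a
# unit instead of being matched as a lone \r with a stray \n left behind.
LINE_BREAK = r"(?:\r\n|\r|\n)"

def unwrap(text: str) -> str:
    """Run both passes from the post, whatever line-ending style the file uses."""
    # Pass 1: line break followed by a lower-case letter.
    text = re.sub(LINE_BREAK + r"([a-z])", r" \1", text)
    # Pass 2: lower-case letter followed by a line break.
    text = re.sub(r"([a-z])" + LINE_BREAK, r"\1 ", text)
    return text

print(repr(unwrap("one\ntwo\r\nThree.\rFour")))
```

If you try this on a real file, note that Python's default text mode converts \r\n to \n while reading; opening the file with `open(path, newline="")` keeps the original line endings intact.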
Old 07-19-2011, 04:05 PM   #2
Toxaris
Wizard
Toxaris ought to be getting tired of karma fortunes by now.
 
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
I use Notepad++; it's open source and supports RegEx quite well.
Old 07-19-2011, 05:01 PM   #3
BobC
Guru
BobC ought to be getting tired of karma fortunes by now.
 
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
I had to do some similar text massaging recently and used OpenOffice.org Writer with Tom Bilek's AltSearch extension. It allows you to save complex search and replace expressions for re-use.

I also have OOOFBTools extension installed - it provides a number of useful line manipulation macros even if you don't want to export the final file as FB2.

BobC
Old 07-19-2011, 06:39 PM   #4
Terisa de morgan
Grand Sorcerer
Terisa de morgan ought to be getting tired of karma fortunes by now.
 
Posts: 6,233
Karma: 11768331
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
I'm a bit... archaic, you can say. I use gvim (vi for PC, free) and you can do it without problem.
Old 07-20-2011, 09:54 AM   #5
scubaddictions
Member
scubaddictions began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2011
Device: Smartphone
Quote:
Originally Posted by Terisa de morgan View Post
I'm a bit... archaic, you can say. I use gvim (vi for PC, free) and you can do it without problem.
I'm sure it can, but I don't really want to jump into the subtle intricacies of VI just to do this one simple task. I already know that this works in at least one Windows based text editor and being that it's based on standard Regex stuff there is no good reason why I can't get it to work in another Windows editor. I just need help getting the syntax right. Any help on this front would be greatly appreciated. Thanks!

Ryan
Old 07-20-2011, 10:15 AM   #6
scubaddictions
Member
scubaddictions began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2011
Device: Smartphone
Quote:
Originally Posted by Toxaris View Post
I use Notepad++; it's open source and supports RegEx quite well.
I've got it as well (and I like it!), but the syntax does not port directly from UltraEdit to Notepad++. It should all be just Regex. Any help on figuring out the syntax to make Notepad++ work would be greatly appreciated.

Thanks!

Ryan
Old 07-20-2011, 10:22 AM   #7
Terisa de morgan
Grand Sorcerer
Terisa de morgan ought to be getting tired of karma fortunes by now.
 
Posts: 6,233
Karma: 11768331
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
I'll try to see if it works in Windows vi (gvim).
Old 07-20-2011, 10:23 AM   #8
scubaddictions
Member
scubaddictions began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2011
Device: Smartphone
Quote:
Originally Posted by scubaddictions View Post
...I don't really want to jump into the subtle intricacies of VI just to do this one simple task. I already know that this works in at least one Windows based text editor...
Errrr, I misspoke there. I didn't mean to imply that you can't use a VI clone in a Windows environment. I guess to be more accurate: I have a functioning solution in a commercial WYSIWYG text editor, and I'd like to find out how to port that solution into a freeware WYSIWYG text editor. Learning VI doesn't sound like heaps of fun. Other than accomplishing this one simple task, I just won't use any new-found knowledge of VI for anything else. Seriously!
Old 07-20-2011, 10:25 AM   #9
scubaddictions
Member
scubaddictions began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2011
Device: Smartphone
Quote:
Originally Posted by Terisa de morgan View Post
I'll try to see if it works in Windows vi (gvim).
If it isn't too much trouble (and you can come up with simple "Cliff Notes" how-to instructions) that would be awesome!
Old 07-31-2011, 01:21 PM   #10
JaneD
Addict
JaneD ought to be getting tired of karma fortunes by now.
 
Posts: 322
Karma: 1057749
Join Date: May 2010
Location: LA, CA
Device: Kindle Paperwhite 2013
I've found a solution for this in TextWrangler. Paste original text into a new doc in TextWrangler, choose "Soft Wrap Text" then "Remove Line Breaks." Works like a charm. Also TextWrangler is free.
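
I can't confirm what TextWrangler does internally for "Remove Line Breaks", so the following is only a guess at a rough equivalent, sketched in Python: join the lines inside each paragraph with a space and keep blank lines as paragraph separators. The function name `unwrap_paragraphs` and the assumption that paragraphs are separated by blank lines are mine, not from the post above.

```python
import re

def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines within each paragraph; keep paragraph breaks."""
    # Normalize Windows and old Mac line endings to \n first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a paragraph boundary.
    paragraphs = re.split(r"\n{2,}", text)
    joined = [
        " ".join(line.strip() for line in para.split("\n") if line.strip())
        for para in paragraphs
    ]
    return "\n\n".join(joined)

sample = "A line\r\nbroken here.\r\n\r\nSecond para\r\ncontinues."
print(unwrap_paragraphs(sample))
```

Unlike the two regex passes earlier in the thread, this does not look at capitalization at all, so it only works when the source text really does separate paragraphs with blank lines.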
Old 08-17-2011, 03:47 PM   #11
NeilPet
Junior Member
NeilPet began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Apr 2011
Location: Calgary, Alberta
Device: iPad
TextWrangler has a built-in regex search and replace function. If you use the Find/Replace function, check the 'grep' checkbox and whatever you type into the search/replace fields is parsed as regex. Of course, each regex tool has a slightly different syntax, so you might have to modify your parameters to get the correct result in TextWrangler.

TextWrangler's help file includes a grep reference.
Old 08-17-2011, 07:58 PM   #12
dgillette.rm
Connoisseur
dgillette.rm knows the difference between 'who' and 'whom'
 
Posts: 95
Karma: 10072
Join Date: Apr 2008
Device: sony
I have not tried your specific problem, but jEdit should handle this. It is written in Java and is cross-platform. I have used it on both a Mac and a Windows machine.
All times are GMT -4.