#1 |
Connoisseur
Posts: 63
Karma: 10
Join Date: Jul 2011
Device: Sony Touch, Nook Simple Touch, Kobo Aura, Android w/CoolReader
|
Replacing Periods with Commas
Greetings!
I've got an epub that started life as a pdf, I think. It has, with some frequency, occurrences in which a period appears where a comma should be. Aside from being able to tell contextually, there is the added bonus that the first letter of the next "sentence" is in lower case. I have identified that I can search for these instances using
Code:
[.] [a-z]
However, if I try to use
Code:
[,] [a-z]
What am I doing wrong, specifically? Obviously my regex sucks, I'm just not sure how to fix it.
#2 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
You were off to a FANTASTIC start (especially if you are new to regex).
Also, take a look through the Regex topic in the Sigil forum, there is A TON of helpful Regex in there: https://www.mobileread.com/forums/sho...d.php?t=167971
A period in Regex stands for "any character", so in order to catch an ACTUAL period, you have to escape it with a backslash '\'.
Search:
Code:
\. ([a-z])
Replace:
Code:
, \1
So, this regex in English says:
"Search for a period, then a space, and capture the lowercase a through z and stick it in \1".
"Replace with a comma, a space, and then whatever lowercase a-z was captured in \1".
Some more complex Regexes might have you capturing a lot more things, and then you would be able to use \2, \3, \4, ...
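For anyone who wants to try the same search and replace outside Sigil, here is a minimal sketch in Python. It assumes Python's `re` module handles this simple pattern the same way Sigil's search does; the sample sentence and the function names are made up for illustration. It also shows what likely went wrong with `[,] [a-z]`: a replacement is plain text, not a pattern, so the brackets are inserted literally and the original letter is lost. (Note that `[.]` in the search does match a literal period, because a period inside square brackets loses its "any character" meaning.)

```python
import re

# Hypothetical sample text, not taken from the actual book.
SAMPLE = "He opened the door. and then he stopped. The room was empty."

def fix_periods(text: str) -> str:
    """Turn '. x' into ', x' when x is a lowercase letter."""
    # \. is a literal period; ([a-z]) captures the lowercase letter as group 1.
    return re.sub(r"\. ([a-z])", r", \1", text)

def fix_periods_two_groups(text: str) -> str:
    """Same fix, but with two captures to show \\1 and \\2 together."""
    # Group 1 is the word before the period, group 2 the letter after it.
    return re.sub(r"(\w+)\. ([a-z])", r"\1, \2", text)

def naive_replace(text: str) -> str:
    """Use a character class as the replacement text (does not work as hoped)."""
    # The replacement is literal text, so '[,] [a-z]' is pasted in as-is.
    return re.sub(r"[.] [a-z]", "[,] [a-z]", text)

print(fix_periods(SAMPLE))
print(fix_periods_two_groups(SAMPLE))
print(naive_replace(SAMPLE))
```

The sentence that starts with a capital letter is left alone, which is the whole point of matching on `[a-z]`.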
Is this a public domain work? After you are done cleaning it up, you should post it on MobileRead!
Last edited by Tex2002ans; 12-15-2013 at 03:42 PM.
#3 |
Connoisseur
Posts: 63
Karma: 10
Join Date: Jul 2011
Device: Sony Touch, Nook Simple Touch, Kobo Aura, Android w/CoolReader
|
Holy smokes! Not only have you helped me with the correct and properly working regex but you've given me one of the clearest and easiest to understand explanations for regex functions I've ever read! I actually learned several things in your explanation above and beyond the bit of code I was searching for.
THANK YOU! I'm afraid the book is not public domain or I would share. It's not in terrible condition but there are places where it is obvious it was a pdf or a scan in a former life. Wow. I really appreciate your help.
#4 |
Well trained by Cats
Posts: 30,908
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
And don't be fooled. This is 100% applicable to Sigil
https://www.mobileread.com/forums/sho...d.php?t=118570 (This is the one that beat REGEX into my thick skull)
#5 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
If you need more in-depth explanations of any Regex in the Regex topic as well, feel free to ask (I can see if I can figure it out and color code + give "in English" explanations). I seem to be one of the few who color codes the Regex explanations... I find that to be much more understandable (although it takes a little longer to lay out/plan/post). Especially when dealing with capture points, it is nice to see the colors matching in input/output.
Always remember though to save a copy of the EPUB before you run any regex, and do not ever use REPLACE ALL unless you have been using the Regex for a long time and know EXACTLY what it will be doing. Regex is extremely powerful, and it is very easy to mess up (even the best of us sometimes make typos, so I always do a few replace/undo, replace/undo, just to make sure that it is doing what I want).
Also helpful is the Sigil Saved Search feature: http://web.sigil.googlecode.com/git/..._searches.html
It allows you to save a list of Searches/Regex, and allows you to easily load/run them.
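The "check before you Replace All" habit can be approximated in a script as well. Here is a rough sketch, assuming Python rather than Sigil: it lists every match with a bit of surrounding context and shows what the replacement would produce, without changing the text. The helper name `preview_matches`, its `context` parameter, and the sample text are all hypothetical.

```python
import re

def preview_matches(pattern: str, replacement: str, text: str, context: int = 15):
    """Return (before, after) snippets for each match without modifying text."""
    previews = []
    for match in re.finditer(pattern, text):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        before = text[start:end]
        # Apply the replacement to this single match only.
        after = (
            text[start:match.start()]
            + match.expand(replacement)
            + text[match.end():end]
        )
        previews.append((before, after))
    return previews

# Hypothetical sample: one real error and one false positive ("e.g. the").
SAMPLE = "It was late. but nobody left. Bring a pen, e.g. the red one."

for before, after in preview_matches(r"\. ([a-z])", r", \1", SAMPLE):
    print(repr(before), "->", repr(after))
```

In this sample the second hit is an abbreviation followed by a lowercase word, exactly the kind of match a blind Replace All would damage.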
http://www.regular-expressions.info/tutorial.html
Nowadays, I mostly just come up with the Regex off the top of my head according to patterns of errors that I recognize while looking through a book.
Last edited by Tex2002ans; 12-15-2013 at 09:51 PM.
#6 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
As far as the global replace goes, you may wish to consider moving over to calibre's new ebook-edit feature. As Sigil is no longer being maintained, it is rather difficult to get new features, whereas Kovid is very active in maintaining calibre.
The important thing to consider is this: from the beginning, calibre ebook-edit has had a very cool feature called global undo-redo. Every time a global action is done, a checkpoint is created, allowing you to roll back those changes. And you can manually create checkpoints too. It is not fully done yet, but the editor has had all the basic functionality added already, and soon we will have saved searches, clips, a formatting toolbar, and all kinds of Sigil goodies! As soon as Kovid gets a chance to code them. He started work less than 2 months ago, so what we have already is huge, considering.