Old 01-09-2015, 05:01 AM   #1
1v4n0
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
Replacing with regex

EDIT: Solved

Let's say I want to replace all the occurrences of dash+digit, without a space, with dash space digit.

I can use a regex to find what I'm looking for (a very simple one: -[[:alpha:]], for example), but I don't know how to automatically insert a space. The issue, of course, is somehow "saving" that digit each time, "copying" it, and pasting it after inserting the space.

I got a vague notion this could be done with javascript, but is there a simpler way?

ty

EDIT: Issue had already been solved, here. If I put the regex between (brackets) it saves the value it finds, which then I can paste by writing \1 in the "replace" field. So cool! And it works outside Sigil as well (I just tried it with Notepad++).

Last edited by 1v4n0; 01-09-2015 at 05:13 AM.
Old 01-09-2015, 05:43 AM   #2
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by 1v4n0 View Post
If I put the regex between (brackets) it saves the value it finds, which then I can paste by writing \1 in the "replace" field.
Those aren't "brackets"; they are "parentheses."

Quote:
Originally Posted by 1v4n0 View Post
EDIT: Issue had already been solved, here.
Also, great job with the edit. So often you see others say something along the lines of "found the answer," and they never post the solution! So when others stumble upon this topic, they have no clue!

If you are just starting out with Regex, I would recommend looking through the Sticky Regex topic here for lots of examples:

https://www.mobileread.com/forums/sho...d.php?t=167971

Anyway, this one is as easy as pie. What I would typically do is this:

Search: -([0-9])
Replace: - \1
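
For anyone who wants to try the same search/replace outside Sigil, here is a minimal sketch in Python. It assumes Python's built-in `re` module, which accepts the same `\1` backreference in the replacement string; the function name and sample text are made up for illustration.

```python
import re

def add_space_after_dash(text: str) -> str:
    """Insert a space between a hyphen and the digit that follows it."""
    # ([0-9]) captures the digit; \1 pastes it back after "- ".
    return re.sub(r"-([0-9])", r"- \1", text)

print(add_space_after_dash("items -3 and -42"))
```

A hyphen that is already followed by a space is left alone, because the pattern requires the digit to come immediately after the hyphen.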

May I ask exactly what the use case is for adding a space between a dash and a number?

The only thing I can think of with dashes and numbers is swapping hyphen -> en dash... although maybe my mind is just too deep in the proper-typography field.

For example, I use this one all the time:

Search: ([0-9])-([0-9])
Replace: \1–\2

What this does is replace "number" + "hyphen" + "number" with "number" + "en dash" + "number".

This is the typographically correct way when dealing with years, or page numbers:

1910–1930
pp. 320–325
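
The hyphen-to-en-dash swap can be sketched the same way in Python. This assumes the built-in `re` module; the function name is hypothetical, and `\g<1>` is simply Python's more explicit spelling of `\1`.

```python
import re

EN_DASH = "\u2013"  # the en dash character

def hyphen_to_en_dash(text: str) -> str:
    """Replace digit-hyphen-digit with digit-en dash-digit."""
    # Two capture groups keep the digits on either side of the hyphen.
    return re.sub(r"([0-9])-([0-9])", r"\g<1>" + EN_DASH + r"\g<2>", text)

print(hyphen_to_en_dash("1910-1930, pp. 320-325"))
```

Hyphens between letters (as in "well-known") are not touched, since both sides of the hyphen must be digits.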

Last edited by Tex2002ans; 01-09-2015 at 05:45 AM.
Old 01-09-2015, 02:16 PM   #3
1v4n0
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
Quote:
Originally Posted by Tex2002ans View Post
Those aren't "brackets"; they are "parentheses."
Hmm what? What's the difference?

Quote:
Originally Posted by Tex2002ans View Post
Also, great job with the edit.


Quote:
Originally Posted by Tex2002ans View Post
May I ask exactly what the use case is for adding a space between a dash and a number?
My bad. I meant "typeface", and anyway I need it mostly with letters - for the asides - because sometimes OCR erases the space between the dash and the following word.

Whatever. I like this place
Old 01-09-2015, 02:49 PM   #4
theducks
Well trained by Cats
Posts: 29,799
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by 1v4n0 View Post
Hmm what? What's the difference?
To REGEX
LOTS

( group
[ set
{ range
Old 01-09-2015, 03:25 PM   #5
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by 1v4n0 View Post
Hmm what? What's the difference?
As theducks said. Especially in Regex, all three symbols mean VERY different things:
  • ( = parenthesis
    • This is used to capture things. You can then use \1, \2, \3 in order to replace the captured parts.
    • For example:
      • Search: (a)(b)
      • Replace: \2\1
      • This would switch "ab" to "ba"
  • [ = bracket
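    • This is used to specify a set of characters (a "character class"). A hyphen inside the brackets gives a range, and the set matches ONE character from it.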
    • For example:
      • Search: [0-5]
      • This would search for every number that is 0 THROUGH 5: 0, 1, 2, 3, 4, 5
      • Search: [b-e]
      • This would search for every letter that is b THROUGH e: b, c, d, e
  • { = braces or curly bracket or "squiggly bracket"
    • This is used to specify amounts. "How many are we looking for?"
    • For example:
      • Search: [0-9]{4}
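      • This would search for exactly 4 digits in a row. Note that it also matches inside a longer run of digits; wrap it in word boundaries (\b[0-9]{4}\b) to find standalone 4-digit numbers only.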
      • This one is EXTREMELY helpful for spotting things like years.
      • Search: [0-9]{5,}
      • This would search for 5 OR MORE numbers in a row.
      • I use this one all the time to catch OCR mistakes with years, when the hyphen didn't OCR correctly: "19421945" -> "1942–1945"
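
The three kinds of symbols above can be tried out side by side. This is a small sketch assuming Python's built-in `re` module; the sample strings are made up, and the variable names are only for illustration.

```python
import re

# ( ) group: capture "a" and "b", then swap them in the replacement.
swapped = re.sub(r"(a)(b)", r"\2\1", "ab")

# [ ] set: any single character from b through e.
letters = re.findall(r"[b-e]", "abcdefg")

sample = "From 19421945 to 1950"

# { } quantifier: runs of five or more digits (e.g. two years fused by OCR).
long_runs = re.findall(r"[0-9]{5,}", sample)

# Exactly four digits, with word boundaries so longer runs are skipped.
years = re.findall(r"\b[0-9]{4}\b", sample)

print(swapped, letters, long_runs, years)
```

Without the `\b` word boundaries, `[0-9]{4}` would also match the two halves of "19421945", which is why the boundaries are added for the year search.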

Quote:
Originally Posted by 1v4n0 View Post
[...] anyway I need it mostly with letters - for the asides - because sometimes OCR erases the space between the dash and the following word.
I personally tend to favor using the Spell Check tool as an alternate way to catch these hyphenation issues.

[Screenshot: SpellcheckHyphen.png]

In the "Filter" box, I just stick a hyphen. This will show you every single word with a hyphen in it. Then you can quickly scan the list and see if you spot any oddities. For example, things like "-the" or "-and" will almost NEVER be correct. So when you have a sentence like this:

Quote:
This is a sample sentence-this is an aside-and this is the continuation of the sentence.
In the spellcheck, you will see "sentence-this" and "aside-and". You can then investigate much more closely.
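
The same idea can be roughly approximated outside Sigil. The sketch below is not Sigil's Spell Check tool, just a Python stand-in assuming the built-in `re` module; the function name and the `SUSPECT_TAILS` word list are hypothetical and would need tuning for a real book.

```python
import re
from typing import List

# Hypothetical starter list of words that rarely belong after a hyphen.
SUSPECT_TAILS = {"the", "and", "this", "but"}

def suspicious_hyphenations(text: str) -> List[str]:
    """List hyphenated words whose second part is a common function word."""
    # Find letters-hyphen-letters tokens such as "aside-and".
    candidates = re.findall(r"[A-Za-z]+-[A-Za-z]+", text)
    return [w for w in candidates if w.split("-")[1].lower() in SUSPECT_TAILS]

sentence = ("This is a sample sentence-this is an aside-and "
            "this is the continuation of the sentence.")
print(suspicious_hyphenations(sentence))
```

Legitimate compounds such as "well-known" are not flagged, because "known" is not in the suspect list.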

Depending on how many hyphens you have in your book, that could be another way to quickly go through and fix those types of errors.

Edit: Oh wait, I think I see what you mean now. You are talking about "open spacing" around dashes. See: https://en.wikipedia.org/wiki/Dash

Heh, at work, all the books I work on use "closed spacing" (no spaces around the dashes). Saves me a bunch of headaches!

Last edited by Tex2002ans; 01-09-2015 at 03:32 PM.
Old 01-09-2015, 04:51 PM   #6
Notjohn
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
I've always had this worrisome feeling that regex -- the very term! -- is beyond my payscale. Now that I see an example of regex in action, I know that I was right to be fearful.
Old 01-09-2015, 05:01 PM   #7
1v4n0
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
Quote:
Originally Posted by Tex2002ans View Post
As theducks said. Especially in Regex, all three symbols mean VERY different things:
I meant "what's the difference in the English language between the word 'bracket' and the word 'parenthesis'?" Anyway, thank you for the thorough explanation.

Quote:
Originally Posted by Tex2002ans View Post
You are talking about "open spacing" around dashes.
Yeah, and also, "space dash word" does not appear in the spellcheck, at least not with perfectEpub.

I'm learning a lot of stuff here. Maybe someday someone will pay me for all the things I know
Old 01-09-2015, 05:52 PM   #8
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by 1v4n0 View Post
Yeah, and also, "space dash word" does not appear in the spellcheck, at least not with perfectEpub.
Ahh yes yes, I see what you mean. Regex would definitely help in that situation!
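
As a rough illustration of that "space dash word" case, here is one possible pattern, sketched in Python with the built-in `re` module. The pattern, the function name and the sample text are assumptions made for this example, not something taken from Sigil or from the posts above.

```python
import re

def open_space_after_dash(text: str) -> str:
    """Turn "word -word" into "word - word" (open spacing)."""
    # (\s-) captures the whitespace plus hyphen; ([A-Za-z]) the first letter.
    return re.sub(r"(\s-)([A-Za-z])", r"\1 \2", text)

print(open_space_after_dash("a sample sentence -this is an aside"))
```

Requiring whitespace before the hyphen keeps ordinary hyphenated words such as "well-known" untouched.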

Quote:
Originally Posted by 1v4n0 View Post
I'm learning a lot of stuff here. Maybe someday someone will pay me for all the things I know
Everyone starts somewhere!

The more you mess around with it, the more knowledge you will slowly absorb. Next thing you know, you will have lots of regex-fu, and you will be up there with the masters!

Quote:
Originally Posted by Notjohn View Post
I've always had this worrisome feeling that regex -- the very term! -- is beyond my payscale. Now that I see an example of regex in action, I know that I was right to be fearful.
No need to be "fearful". If you stick with the super basic stuff (like the ones I gave above), you will most likely not run into any sorts of problems (although I stress, you shouldn't run Regex willy nilly, and never Replace All, unless you know EXACTLY what the Regex is doing, and have tested it thoroughly).

Regular expressions save me TONS of hours of work every single day, and they allow me to catch really obscure/hard typos that would otherwise be quite hard to spot with just the naked eye (or old-school Search/Replace). I have even caught hundreds of typos in books that have been professionally edited and then laid out by a pro typographer. (Working directly in code = way more granularity than in GUI word processor/typography programs.)

For example, in a similar vein as an accidental hyphen attached to a word, sometimes OCR causes accidental commas attached to a word:

Quote:
This is a sample sentence,this is another example.
This can easily be caught with something like:

Search: ([a-z]),([a-z])
Replace: \1, \2
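
Here is the same comma fix as a small Python sketch, assuming the built-in `re` module; the function name is made up for illustration.

```python
import re

def space_after_comma(text: str) -> str:
    """Insert a space after a comma squeezed between two lowercase letters."""
    # \1 and \2 put the captured letters back on either side of ", ".
    return re.sub(r"([a-z]),([a-z])", r"\1, \2", text)

print(space_after_comma("This is a sample sentence,this is another example."))
```

Because both groups only accept lowercase letters, numbers like "1,000" are left as they are.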

Another one that I catch ALL THE TIME in books is a page number accidentally attached to the "p.":

Quote:
This is a sample quote (LastName, p.123).
Search: (p\.)([0-9])
Replace: \1 \2
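
And the page-number fix, again as a hedged Python sketch using the built-in `re` module (the function name is hypothetical):

```python
import re

def space_after_page_abbrev(text: str) -> str:
    """Insert a space between "p." and a page number stuck to it."""
    # p\. matches a literal "p." (the backslash escapes the dot).
    return re.sub(r"(p\.)([0-9])", r"\1 \2", text)

print(space_after_page_abbrev("This is a sample quote (LastName, p.123)."))
```

The same pattern also catches "pp.320", since the second "p" plus the period is enough to match.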

Regex also helps when you have to clean up a lot of abysmal code. For example, cleaning up all of the Calibre### classes, or the absolutely atrocious InDesign overrides.

I don't know how I survived before I knew Regex!
Old 01-12-2015, 04:04 PM   #9
Notjohn
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
>Regex also helps when you have to clean up a lot of abysmal code. For example, cleaning up all of the Calibre### classes, or the absolutely atrocious InDesign overrides.

Well, at least I am spared those two needs!
Old 01-14-2015, 04:34 PM   #10
Hitch
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by Notjohn View Post
>Regex also helps when you have to clean up a lot of abysmal code. For example, cleaning up all of the Calibre### classes, or the absolutely atrocious InDesign overrides.

Well, at least I am spared those two needs!
RegexBuddy® is your friend. Even power-users here use it. It's cheap (what, $40-45USD?) and it will save you untold hours of hair-pulling.

Hitch
Old 01-16-2015, 04:53 PM   #11
Notjohn
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
Yes, $39.95 with a 90-day warranty. I suppose I should....

Thank you, Hitch.
Old 01-17-2015, 02:07 PM   #12
Hitch
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by Notjohn View Post
Yes, $39.95 with a 90-day warranty. I suppose I should....

Thank you, Hitch.
You know that I always look out for you, NJ. You make enough dough, you can splurge a bit on RB.

Hitch