Thread: Regex examples
Old 08-18-2022, 11:04 PM   #729
Tex2002ans
Wizard
Quote:
Originally Posted by CubGeek
Okay, after reading the <i>, <em> or <span> for italics thread from 2020 [...] [and paying particular attention to Tex2002ans posting about the underlying purposes for <em> and <i> <em>therein</em>], I've seen the error of my ways regarding using <span> for setting italics.


The easiest way to do it is to use DiapDealer's fantastic "TagMechanic" plugin.

I explained how to install Sigil plugins in this 2021 post.

And I gave step-by-step instructions on how to use TagMechanic here:

That will help mass convert your <span class="italics"> -> <i> or <em>.

It will be much safer than trying to use Regular Expressions, because regex can't safely handle complicated cases of <span>s inside of <span>s.
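
Here is a rough sketch of what goes wrong with nested <span>s. It uses Python's re module as a stand-in, which is an assumption on my part: Sigil's search is not Python, so small details may differ. The sample string and the "bold" class are made up for illustration.

```python
import re

# Hypothetical sample: a bold <span> nested inside an italics <span>.
html = '<span class="italics">one <span class="bold">two</span> three</span>'

# Lazy version: (.*?) stops at the FIRST </span>, which closes the inner span.
lazy_result = re.sub(r'<span class="italics">(.*?)</span>', r'<i>\1</i>', html)

# Negated-class version: [^<]+ refuses to cross the inner tag, so nothing matches.
strict_result = re.sub(r'<span class="italics">([^<]+)</span>', r'<i>\1</i>', html)

print(lazy_result)    # the <i> gets closed where the bold span should close
print(strict_result)  # input comes back unchanged
```

Neither result is what you want: the first produces mismatched tags, the second silently skips the span.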

Quote:
Originally Posted by CubGeek
I've figured out that
Code:
<span class="italics">([^>]+)</span>
Find: <span class="italics">([^<]+)</span>
Replace: <i>\1</i>

You see the parentheses you wrapped around your stuff? That's called a "Capture Group".

Explanation of the Find

Let's break it down into each piece:
  • <span class="italics">
  • (
    • [^<]+
  • )
  • </span>

It's saying:
  • "Hey, find the italics <span>."
  • "You see this open parenthesis? Stick this next stuff into a group!"
    • "Keep grabbing one or more characters, as long as each one is NOT a '<'."
  • "Closing parenthesis? Everything captured between them goes into GROUP 1!"
  • "Hey, find the closing </span>."
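
If you want to poke at the pieces yourself, the same breakdown can be sketched in Python's re module (my stand-in here, not what Sigil runs). Verbose mode lets each piece carry its own comment. The sample sentence is hypothetical.

```python
import re

# The Find expression, one piece per line. In verbose mode, unescaped
# whitespace is ignored, so the space in the opening tag is escaped.
find = re.compile(r'''
    <span\ class="italics">   # the opening italics span
    (                         # open parenthesis: start of group 1
      [^<]+                   # one or more characters that are not a less-than sign
    )                         # close parenthesis: end of group 1
    </span>                   # the closing span tag
''', re.VERBOSE)

# Hypothetical sample text.
match = find.search('He said <span class="italics">hello there</span> twice.')

whole_match = match.group(0)  # everything the Find matched
group_one = match.group(1)    # only what sat between the parentheses

print(whole_match)
print(group_one)
```

Group 0 is always the whole match; group 1 is only the text between the tags.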

Now when you're Replacing, you can use \1 to get "Group #1".

Explanation of the Replace
  • <i> = "Put the opening <i>."
  • \1 = "Put whatever was captured in GROUP 1 here."
  • </i> = "Put the closing </i>."
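
Putting the Find and Replace together, a minimal sketch in Python (again only a stand-in for Sigil's Find & Replace, and the sample paragraph is invented):

```python
import re

# Hypothetical paragraph with two italics spans.
text = ('<p>She read <span class="italics">Dune</span> and '
        '<span class="italics">Emma</span>.</p>')

# Same Find and Replace as above; \1 is whatever group 1 captured.
converted = re.sub(r'<span class="italics">([^<]+)</span>', r'<i>\1</i>', text)

print(converted)
```

Every match gets its own Group 1, so each span keeps its own text.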

- - -

Side Note: If you have more complicated regex, you can get up to 9 capture groups!

\1, \2, \3, [...], \9

But at that point, it's probably smarter to split your search/replaces into smaller pieces.
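
A small sketch of two capture groups at once, in Python as a stand-in. The pattern here is my own illustration, not one from this thread: it captures the class name as group 1 and the text as group 2.

```python
import re

# Hypothetical sample text.
text = 'A <span class="italics">Dune</span> B'

# Group 1 = the class name, group 2 = the text inside the span.
two_groups = re.sub(r'<span class="(\w+)">([^<]+)</span>',
                    r'<i class="\1">\2</i>',
                    text)

print(two_groups)
```

Groups are numbered by the order of their opening parentheses, left to right.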

- - -

Side Note #2: If you want some more Regex tricks, I just wrote a post a few months ago here:

which linked to some of my other posts over the years. I break down + color-coordinate many of the ones I use.

Quote:
Originally Posted by Turtle91
I go pretty easy...and it seems to work so far...

find: <i>(.*?)</i>
replace: <em>\1</em>

or

find: <span class="italics">(.*?)</span>
replace: <em>\1</em>
Yep, this type of stuff works too.

Easier/Safer to use Tag Mechanic though. :P
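
For the curious, here is a rough comparison of the lazy (.*?) against the [^<]+ version when the italics span contains another tag. It is sketched in Python as a stand-in for Sigil's engine, and the sample string is invented.

```python
import re

# Hypothetical sample: an italics span with a <b> tag inside it.
sample = '<span class="italics">a <b>very</b> long day</span>'

# Lazy (.*?) happily crosses the inner <b> tags to reach the first </span>.
lazy = re.sub(r'<span class="italics">(.*?)</span>', r'<em>\1</em>', sample)

# [^<]+ stops dead at the first '<', so this span is not matched at all.
strict = re.sub(r'<span class="italics">([^<]+)</span>', r'<em>\1</em>', sample)

print(lazy)
print(strict)
```

So the lazy version catches more spans, but that same reach is what bites you once a <span> sits inside another <span>.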

Quote:
Originally Posted by CubGeek
Since the stuff I'm working on has a combination of <i> for "inside voice," and "named things" as well as <em> for word emphasis, this certainly has been a learning experience!
And I don't know if you caught this topic:

where I explained differences between <i> + <em> even further.

Last edited by Tex2002ans; 08-18-2022 at 11:12 PM.