|  05-05-2022, 05:47 AM | #721 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Mar 2019 Location: Slovenia Device: PocketBoot Inkpad 3 | 
			
			Any idea on how to capture uppercase words with special diacritic characters, like Ū Ṃ Ḥ Ū etc.? I tried the following, but it doesn't work. I want to capture uppercase words with 2 or more characters. Code: ([[:upper:]]{2,}) | 
|   |   | 
|  05-05-2022, 06:15 AM | #723 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Mar 2019 Location: Slovenia Device: PocketBoot Inkpad 3 | 
			
			@BeckyEbook, thank you!
		 | 
|   |   | 
|  05-05-2022, 06:55 AM | #724 | 
| Grand Sorcerer            Posts: 28,869 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | 
			
			Also remember that \p{Lu} and \p{Ll} can be used to match any uppercase and lowercase letter, respectively, in any language without requiring the (*UCP) switch (in Sigil's PCRE regex engine). \p{L} matches any letter (Unicode or otherwise) and \P{L} matches anything NOT a letter. So (\p{Lu}{2,}) should theoretically do the same thing (not near a machine to verify syntax). See the Unicode Categories section of https://www.regular-expressions.info/unicode.html for more categories. | 
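For anyone who wants to check the idea outside Sigil: Python's built-in re module does not understand \p{Lu}, so the sketch below only emulates what (\p{Lu}{2,}) is meant to do by testing each character's Unicode general category. The function name uppercase_runs and the sample text are made up for illustration; this is not how Sigil's PCRE engine does it internally.

```python
import unicodedata
from itertools import groupby


def uppercase_runs(text, min_len=2):
    """Return runs of at least min_len consecutive uppercase letters (category Lu)."""
    # NFC composes a base letter plus combining mark into one code point where
    # one exists, so a decomposed "U" + macron still counts as a single letter.
    text = unicodedata.normalize("NFC", text)
    runs = []
    # Group consecutive characters by whether they are uppercase letters.
    for is_upper, chars in groupby(text, key=lambda ch: unicodedata.category(ch) == "Lu"):
        run = "".join(chars)
        if is_upper and len(run) >= min_len:
            runs.append(run)
    return runs


# Hypothetical sample: \u016a is U-with-macron, \u1e24 is H-with-dot-below.
sample = "the S\u016aTRA and DU\u1e24KHA, but not A or Word"
found = uppercase_runs(sample)
```

Here found should hold the two all-caps words, diacritics included, while the lone "A" and the "W" of "Word" are skipped because they are runs of only one uppercase letter.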
|   |   | 
|  08-18-2022, 01:51 PM | #725 | 
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | 
			
			oh.... wow.  49 pages over the course of ten years?!  well, this Regex newbie's got a lot of reading homework, it seems.
		 | 
|   |   | 
|  08-18-2022, 02:48 PM | #726 | 
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | 
			
			Okay, after reading the "<i>, <em> or <span> for italics" thread from 2020, and then the "Extended <head> chapter: NOT necessary?" 2017 thread linked therein [and paying particular attention to Tex2002ans's posts about the underlying purposes of <em> and <i> <em>therein</em>], I've seen the error of my ways regarding using <span> for setting italics. 
 I've figured out that Code: <span class="italics">([^>]+)</span> I'm happy to do the legwork and the trial-and-error to learn what works. I guess my search skills need an update, too, because the results I am turning up don't seem to work for me.  Can someone help point me in the right direction? [edit] Okay, I THINK I found it, but it was hit-or-miss, because it seemed that everything was for JavaScript/C#/VB.NET/PHP/Ruby/etc.  So, it seems that some trial-and-error resulted in me learning about <i>backreferences</i> and <i>capture groups</i>.  I've gotten it to work so that Code: <em>\g<1></em>  Okay, next question: is this a kludge and there's a better way? Or is this correct? Thanks, y'all! [/edit] Last edited by CubGeek; 08-18-2022 at 03:22 PM. | 
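A quick way to sanity-check that Find/Replace pair outside Sigil is Python's re module, which happens to accept both the \g<1> and the shorter \1 backreference forms in replacements. This is only a sketch on a made-up one-line sample; the variable names are illustrative, and Sigil's PCRE engine may differ in details.

```python
import re

# Find pattern from the post: capture the text between the span tags.
find = r'<span class="italics">([^>]+)</span>'

# Both forms refer to capture group 1 in a Python replacement string.
replace_long = r'<em>\g<1></em>'
replace_short = r'<em>\1</em>'

# Hypothetical sample line.
html = '<p>This is <span class="italics">important</span> text.</p>'

result_long = re.sub(find, replace_long, html)
result_short = re.sub(find, replace_short, html)
print(result_long)
```

In Python the two forms give the same result here. The \g<1> spelling mainly earns its keep when a literal digit follows the group reference, where \1 followed by 0 would be read as group 10.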
|   |   | 
|  08-18-2022, 04:32 PM | #727 | 
| A Hairy Wizard            Posts: 3,394 Karma: 20212733 Join Date: Dec 2012 Location: Charleston, SC today Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire | 
			
			That's pretty advanced stuff! I go pretty easy...and it seems to work so far... find: <i>(.*?)</i> replace: <em>\1</em> or find: <span class="italics">(.*?)</span> replace: <em>\1</em> etc. | 
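The ? in (.*?) is doing real work in those patterns. As a rough illustration (Python's re module on an invented sample line, not Sigil itself), here is the lazy version next to a greedy one without the ?:

```python
import re

# Hypothetical line with two separate italic runs.
line = "<p><i>one</i> and <i>two</i></p>"

# Lazy: (.*?) stops at the first closing tag it can, so each run is handled on its own.
lazy = re.sub(r"<i>(.*?)</i>", r"<em>\1</em>", line)

# Greedy: (.*) runs to the LAST closing tag on the line and swallows everything between.
greedy = re.sub(r"<i>(.*)</i>", r"<em>\1</em>", line)

print(lazy)
print(greedy)
```

The lazy pattern converts both runs cleanly; the greedy one produces a single <em> that wraps a stray </i> and <i>, which is exactly the kind of damage that is hard to spot after a "Replace All".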
|   |   | 
|  08-18-2022, 10:11 PM | #728 | |
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | Quote: 
   | |
|   |   | 
|  08-18-2022, 11:04 PM | #729 | ||||
| Wizard            Posts: 2,306 Karma: 13057279 Join Date: Jul 2012 Device: Kobo Forma, Nook | Quote: 
   The easiest way to do it is to use DiapDealer's fantastic "TagMechanic" plugin. I explained how to install Sigil plugins in this 2021 post. And I gave step-by-step instructions on how to use TagMechanic here: That will help mass convert your <span class="italics"> -> <i> or <em>. It will be much safer than trying to use Regular Expressions, because regex can't safely handle complicated cases of <span>s inside of <span>s. Quote: 
 Replace: <i>\1</i> You see the parentheses you wrapped around your stuff? That's called a "Capture Group". Explanation of the Find Let's break it down into each piece: 
 It's saying: 
 Now when you're Replacing, you can use \1 to get "Group #1". Explanation of the Replace 
 - - - Side Note: If you have more complicated regex, you can get up to 9 capture groups! \1, \2, \3, [...], \9 But at that point, it's probably smarter to split your search/replaces into smaller pieces. - - - Side Note #2: If you want some more Regex tricks, I just wrote a post a few months ago here: which linked to some of my other posts over the years. I break down + color-coordinate many of the ones I use.  Quote: 
 Easier/Safer to use Tag Mechanic though. :P Quote: 
  And I don't know if you caught this topic: where I explained differences between <i> + <em> even further.   Last edited by Tex2002ans; 08-18-2022 at 11:12 PM. | ||||
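To make the capture-group numbering from the post above concrete, here is a small sketch in Python's re module. The two-group pattern and the sample markup are invented for illustration (they are not from the thread); the point is only that groups are numbered left to right by their opening parenthesis, and \1, \2, ... pick them out in the replacement.

```python
import re

# Hypothetical pattern with two groups: group 1 = class name, group 2 = inner text.
pattern = re.compile(r'<span class="(\w+)">(.*?)</span>')

text = '<p>An <span class="italics">example</span>.</p>'

match = pattern.search(text)
whole = match.group(0)   # the entire matched text
first = match.group(1)   # what the first (...) captured
second = match.group(2)  # what the second (...) captured

# Reuse both groups in the replacement, each in a different place.
converted = pattern.sub(r'<i class="\1">\2</i>', text)
print(converted)
```

With only one or two groups this stays readable; once a pattern needs many more, splitting it into several smaller search/replaces, as suggested above, is usually easier to verify.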
|   |   | 
|  08-19-2022, 11:43 AM | #730 | ||||
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | Quote: 
 Quote: 
  However, I like your explanation better.  Much more user friendly.   Quote: 
  Quote: 
  I'm sure I was mumbling about em's and i's and strong's and b's (oh my!) in my sleep to the annoyance of my cats   | ||||
|   |   | 
|  08-19-2022, 02:09 PM | #731 | |||
| Wizard            Posts: 2,306 Karma: 13057279 Join Date: Jul 2012 Device: Kobo Forma, Nook | Quote: 
 Code: <p class="normal"><span class="normal">This is an <span class="italics">example</span>.<sup><span class="tiny">1</span></sup></span></p> Regular Expressions would get completely confused with the 3 different </span>s, where TagMechanic would be able to figure out which </span> connects with which one.  Of course, with clean code, this wouldn't be a problem, but in real life there's always these crazy examples that creep up... and it comes to bite you in the butt later when you already accidentally did a "Replace All" 3 hours ago!  Quote: 
   You can also use those in FINDs as well! For example, one of the tricks I use is: Double Word Check Find: (\b[a-z]+) (\1\b) Replace: \1 This grabs a lowercase word + looks for it again: 
 How does it work? It uses a few tricks: 
 Shove all that in GROUP 1. 
 Shove all that in GROUP 2. Now, when you replace, you're only replacing with GROUP 1, meaning that duplicated word never makes it: 
  - - - Usage Note: You do have to be careful of false positives though, so NEVER do a "Replace All". Always do a one-by-one check. There shouldn't ever be too many "doubles" within your book, but they're an extremely common typo that's very hard to catch. (Usually the human brain just skips right over them.) - - - Quote: 
 Glad to see someone benefited from all those in-depth discussions.   Last edited by Tex2002ans; 08-19-2022 at 02:12 PM. | |||
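The nested-<span> problem from the start of this post can be reproduced with a few lines of Python's re module. This sketch runs a simple lazy pattern against the exact sample line quoted above and only shows where the regex goes wrong; it does not attempt to show how TagMechanic itself resolves the tags.

```python
import re

# The sample line from the post: three spans, one nested inside another.
line = ('<p class="normal"><span class="normal">This is an '
        '<span class="italics">example</span>.<sup><span class="tiny">1</span>'
        '</sup></span></p>')

# Naive attempt to grab the contents of the outer "normal" span.
pattern = re.compile(r'<span class="normal">(.*?)</span>')

match = pattern.search(line)
captured = match.group(1)

# "Unwrapping" the outer span with this pattern removes the wrong closing tag.
unwrapped = pattern.sub(r"\1", line)
print(captured)
print(unwrapped)
```

The capture ends at the first </span> it meets, which belongs to the italics span, so the "unwrapped" line is left with an italics span that now closes in the wrong place. A regex has no notion of which closing tag pairs with which opening tag.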
|   |   | 
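The Double Word Check from post #731 also works as written in Python's re module, which makes it easy to try on sample text before using it in Sigil. The sketch below uses invented sentences; it lists matches one at a time rather than replacing blindly, in the spirit of the "never Replace All" advice.

```python
import re

# Find from the post: a lowercase word, a space, then the same word again (\1).
double_word = re.compile(r"(\b[a-z]+) (\1\b)")

# Hypothetical sample with a genuine typo.
typo = "I went to the the store"
fixed = double_word.sub(r"\1", typo)

# Hypothetical sample with one typo and one legitimate double ("had had").
mixed = "she had had enough of the the noise"
hits = [m.group(0) for m in double_word.finditer(mixed)]

# The trailing \b keeps a word from matching the start of a longer one.
no_hit = double_word.search("the theme")

print(fixed)
print(hits)
```

Both "had had" and "the the" are reported in the second sample, which is why each hit needs a human look: only one of them is a typo.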
|  08-19-2022, 02:25 PM | #732 | 
| Resident Curmudgeon            Posts: 80,694 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			Use <i> and <b> and forget <em> and <strong> ever existed.
		 | 
|   |   | 
|  08-19-2022, 02:35 PM | #733 | 
| Grand Sorcerer            Posts: 28,869 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | 
			
			Drop it Jon. Your preferences are not really relevant to the conversation at hand.
		 | 
|   |   | 
|  08-19-2022, 04:28 PM | #734 | 
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | 
			
			After reading threads that spanned (ha! <span>ned!) 5+ years, and seeing you spouting the same thing about <i> and <em> and <b> and <strong> (regardless of being educated better), I'll at least give you credit for consistency.  But that's all.  Thanks for your input. | 
|   |   | 
|  08-19-2022, 04:30 PM | #735 | |
| Connoisseur  Posts: 52 Karma: 10 Join Date: Sep 2021 Location: Upstate NY, USA Device: iPad Pro, Kindle basic | Quote: 
 So, if my learning how to properly show varying types of emphasis helps convey nuances to someone who's relying on a screen reader or similar (on the very infinitesimal chance they access something that I put together), then it was time well spent.   | |
|   |   | 