Thread: Regex examples
Old 11-01-2025, 03:37 AM   #805
ElMiko
Quote:
Originally Posted by theducks View Post
Here is My find (just) Roman
Code:
<p class="\w">([CLXVI]{1,7})</p>
I do a minor adjust if additional word like Chapter are needed.
That's a lot cleaner than what I've used for that historically. One thing I'd add is that the negated character class @BillPearl used above might come in handy here, too. So, to isolate Roman-numeral headings within any <p> element, whatever attributes it carries, you could search for:

Code:
<p[^>]*?>([CLXVI]{1,7})</p>
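To see what the negated class buys you, here is a rough Python sketch comparing the two patterns. The sample lines and the find_numeral helper are made up for illustration, and Python's re module stands in for the editor's search box, so treat it as a quick sanity check rather than a statement about how any particular editor behaves.

```python
import re

# Pattern quoted from theducks: class name must be a single word character.
ORIGINAL = re.compile(r'<p class="\w">([CLXVI]{1,7})</p>')
# Variant from this post: any attributes (or none) on the <p> tag.
ANY_P = re.compile(r'<p[^>]*?>([CLXVI]{1,7})</p>')


def find_numeral(pattern, line):
    """Return the captured numeral, or None when the line does not match."""
    m = pattern.search(line)
    return m.group(1) if m else None


# Hypothetical sample markup, invented for this example.
samples = [
    '<p class="h">XIV</p>',
    '<p class="chapter" id="c3">III</p>',
    '<p>C</p>',
    '<p class="h">Chapter XIV</p>',
]

for line in samples:
    print(line, find_numeral(ORIGINAL, line), find_numeral(ANY_P, line))
```

The last sample is there to show that neither pattern picks up a heading with extra words in it, which is where the "minor adjust" for something like Chapter comes in.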
The only other note I'd make is that if you are trying to capture all chapter headings up to chapter 100 (which the inclusion of "C" would suggest), you probably need to widen your quantifier range to {1,8}, since exactly one number between 1 and 100 is 8 characters long as a Roman numeral: 88, or LXXXVIII.
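If you want to check the length claim rather than take it on faith, a small Python sketch like the one below will do it. The to_roman helper is my own throwaway converter (standard subtractive notation, only meant for 1 to 100), not something from calibre or Sigil.

```python
import re


def to_roman(n):
    """Convert 1..100 to a Roman numeral using standard subtractive notation."""
    table = [(100, 'C'), (90, 'XC'), (50, 'L'), (40, 'XL'),
             (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I')]
    parts = []
    for value, symbol in table:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return ''.join(parts)


numerals = {n: to_roman(n) for n in range(1, 101)}

# Every chapter number whose numeral does not fit in {1,7}.
too_long = [n for n, r in numerals.items() if len(r) > 7]
print(too_long, [numerals[n] for n in too_long])

# The same heading tested against both quantifier ranges.
narrow = re.compile(r'<p[^>]*?>([CLXVI]{1,7})</p>')
wide = re.compile(r'<p[^>]*?>([CLXVI]{1,8})</p>')
line = '<p class="h">' + numerals[88] + '</p>'
print(bool(narrow.search(line)), bool(wide.search(line)))
```

With {1,7} the chapter 88 heading is skipped entirely, not partially matched, because the closing </p> has to follow the numeral directly.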

Last edited by ElMiko; 11-01-2025 at 03:44 AM.