Quote:
Originally Posted by theducks
Here is My find (just) Roman
Code:
<p class="\w">([CLXVI]{1,7})</p>
I do a minor adjust if additional word like Chapter are needed.
|
That's a lot cleaner than what I've used for that historically. The only thing I'd add is that the negated character class @BillPearl used above might come in handy here, too. So, to match the Roman-numeral headings in any <p> element, whatever attributes it carries, you could search for:
Code:
<p[^>]*?>([CLXVI]{1,7})</p>
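For anyone who wants to see the difference between the two searches, here's a rough Python sketch. The sample markup and class names are made up for illustration, and Python's re module stands in for whatever regex engine your editor uses, so treat it as a sanity check rather than a guarantee of identical behavior.

```python
import re

# Hypothetical sample markup; the class and id values are invented.
sample = """<p class="c">IV</p>
<p>IX</p>
<p class="chapter-number" id="ch12">XII</p>
<p class="body">Chapter text.</p>"""

# Quoted search: \w matches exactly one word character,
# so it only finds <p> tags whose class is a single character.
narrow = re.compile(r'<p class="\w">([CLXVI]{1,7})</p>')

# Negated character class: [^>]*? skips any attributes (or none)
# up to the closing > of the opening tag.
broad = re.compile(r'<p[^>]*?>([CLXVI]{1,7})</p>')

narrow_hits = narrow.findall(sample)
broad_hits = broad.findall(sample)

print(narrow_hits)
print(broad_hits)
```

The broader pattern picks up the bare <p> and the multi-attribute <p> as well, while still ignoring ordinary paragraphs.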
The only other note I'd make is that if you're trying to capture chapter headings for up to 100 chapters (which the inclusion of "C" would suggest), you need to widen the quantifier range to {1,8}, since exactly one number between 1 and 100 is 8 characters long: 88, or LXXXVIII.
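If you'd rather check the 88 claim than take my word for it, here's a quick Python sketch. The to_roman helper and the "chapter" class are just illustrative names of my own, and it assumes standard subtractive notation (IV, IX, XL, XC).

```python
import re

def to_roman(n: int) -> str:
    """Convert 1..100 to a Roman numeral using subtractive notation."""
    numerals = [(100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
                (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

up_to_7 = re.compile(r'<p[^>]*?>([CLXVI]{1,7})</p>')
up_to_8 = re.compile(r'<p[^>]*?>([CLXVI]{1,8})</p>')

def heading(n: int) -> str:
    # Hypothetical heading markup for chapter n.
    return f'<p class="chapter">{to_roman(n)}</p>'

# Chapters 1..100 that each pattern fails to match.
missed_7 = [n for n in range(1, 101) if not up_to_7.fullmatch(heading(n))]
missed_8 = [n for n in range(1, 101) if not up_to_8.fullmatch(heading(n))]

print(missed_7, to_roman(88), len(to_roman(88)))
print(missed_8)
```

With {1,7} the only heading that slips through is chapter 88; with {1,8} nothing is missed.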