The hardest case seems to be p tags around upper-case ONE, TWO, THREE, etc., progressing up to TWENTY-ONE ...
I worked out the letter set once, which is a subset of the upper-case alphabet, then used find & replace (stepping through manually to be safe). The "can't be bothered to do that properly" set is [-E_Y]
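For anyone who wants to reproduce the letter set rather than work it out by hand, here is a rough Python sketch. It is not what I actually ran (I used an editor's find & replace). The `WORDS` list, the bare `<p>...</p>` form with no attributes, and the names `letter_set` and `find_candidates` are all assumptions made for illustration.

```python
import re

# Assumed list of the number words in question, spelled out by hand.
WORDS = [
    "ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT", "NINE",
    "TEN", "ELEVEN", "TWELVE", "THIRTEEN", "FOURTEEN", "FIFTEEN", "SIXTEEN",
    "SEVENTEEN", "EIGHTEEN", "NINETEEN", "TWENTY", "TWENTY-ONE",
]


def letter_set(words):
    """Return the distinct characters used by the words, sorted."""
    return "".join(sorted(set("".join(words))))


LETTERS = letter_set(WORDS)

# Character class built from the exact letter set. re.escape keeps the
# hyphen literal inside the class. Assumes plain <p> tags with no attributes.
PATTERN = re.compile(r"<p>([" + re.escape(LETTERS) + r"]+)</p>")


def find_candidates(html):
    """Return the text of every <p> whose content uses only those letters."""
    return PATTERN.findall(html)


if __name__ == "__main__":
    sample = "<p>ONE</p><p>Some text</p><p>TWENTY-ONE</p><p>SHOUT</p>"
    candidates = find_candidates(sample)
    print(LETTERS)
    print(candidates)
    # Stricter check: keep only candidates that are real number words.
    print([c for c in candidates if c in WORDS])
```

The sample shows why stepping through manually is worth it: a paragraph like SHOUT uses only letters from the set, so the character class matches it even though it is not a number word. Every letter in the set falls between E and Y, so a lazier class covering hyphen plus E through Y would also match all the words, at the cost of even more false positives.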