Thread: Regex examples
Old 07-18-2017, 11:16 AM   #521
roger64
Hi

To avoid a disgraceful line break between the name of a ruler and his number, the French insert a no-break space between them (represented here by _). Thus we find Charles_XII and Henri_II. This rule applies even for Louis_XVI...

I'd like to write a regex that automatically adds the missing no-break spaces. There are two parts: a name beginning with a capital letter and a number written in Roman numerals.

The following regex finds all of them:
Code:
([A-Z])([a-z]+)\s(I|V|X)+
I dropped the L because a revolution should take place long before number fifty.

However, this regex matches too much, because it also matches the beginning of Hans Viktor, Si Votre..., Pour Vienne...
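
To make the problem concrete, here is a small check of the regex above, written in Python only for illustration (the `samples` list and the `matched_text` helper are my own names, not part of any tool):

```python
import re

# The pattern from the post, unchanged.
pattern = re.compile(r"([A-Z])([a-z]+)\s(I|V|X)+")

# Hypothetical sample list: three rulers and three false positives.
samples = ["Charles XII", "Henri II", "Louis XVI",
           "Hans Viktor", "Si Votre", "Pour Vienne"]

def matched_text(text):
    """Return the substring the pattern matched, or None if nothing matched."""
    m = pattern.search(text)
    return m.group(0) if m else None

for s in samples:
    print(f"{s!r:15} -> {matched_text(s)!r}")
```

The rulers are matched in full, but so are "Hans V", "Si V" and "Pour V", because nothing in the pattern says what may follow the numeral.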

So I'd like to make sure it does not match when the Roman numerals are followed by lowercase letters. Could some kind helping hand improve this regex?
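
One possible direction, as a rough sketch only and not a tested solution: end the numeral with a word boundary `\b`, so that a numeral running straight into more letters is rejected. Note that `\b` is stricter than what I asked for, since it also rejects digits and capitals directly after the numeral. The sketch is in Python; `RULER`, `NBSP` and `add_nbsp` are made-up names, and it assumes the missing space is a plain space and the name has no accented letters.

```python
import re

NBSP = "\u00a0"  # no-break space (U+00A0)

# Capitalised name, one ordinary space, Roman numeral built from I, V, X,
# then a word boundary so that "Hans Viktor" is not treated as "Hans V".
# Assumption: unaccented names only ([a-z] does not cover e.g. "é").
RULER = re.compile(r"([A-Z][a-z]+) ([IVX]+)\b")

def add_nbsp(text):
    """Replace the space between a name and its numeral with a no-break space."""
    return RULER.sub(lambda m: m.group(1) + NBSP + m.group(2), text)

sample = "Charles XII met Hans Viktor. Si Votre ami voit Louis XVI, Pour Vienne."
result = add_nbsp(sample)
print(result.replace(NBSP, "_"))  # show the no-break spaces as _
```

This still cannot tell a ruler from any other capitalised word followed by a numeral, so the replacements would need a quick review.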

Last edited by roger64; 07-18-2017 at 11:22 AM.