08-14-2014, 11:58 AM   #10
Papirus
Junior Member
Join Date: Aug 2014
Device: Papyre 613
Quote:
Originally Posted by DiapDealer
I don't really use if|then|else regex conditionals myself, but the regex module calibre's editor uses certainly supports them. Probably just a matter of getting the syntax right. For example:
Code:
(a)?b(?(1)c|d)
Matches both "bd" and "abc"
This is not the situation.

Imagine you have a text and want to find the Roman numerals in it in order to set them in small caps.

([ivxlcdm]+) could be a possibility, with <small>\1</small> as the replacement.

The above expression will return Roman numerals, but also ordinary words that happen to be formed from the same characters.

E.g. clic, CD, ill, id, lid, livid, isolated letters, etc.

In Spanish there are probably some even more frequent ones: thousand (mil) and, above all, my (mi).
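
To make the over-matching concrete, here is a small sketch in Python using the standard `re` module (not calibre's editor itself). The sample sentence is made up for illustration; the pattern is the naive one from above with case-insensitive matching.

```python
import re

# Naive pattern from the post: any run of roman-numeral letters.
naive = re.compile(r"([ivxlcdm]+)", re.IGNORECASE)

# Made-up sample: one real numeral (XIV) among ordinary words.
sample = "Chapter XIV: mi CD, a livid lid."

hits = naive.findall(sample)
print(hits)  # the real numeral is buried among false positives
```

Only one of the hits is a numeral; the rest are words, or even the first letter of "Chapter", because nothing ties the match to word boundaries.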

So if I use the above expression I will of course get the Roman numerals, but also hundreds of matches that are not Roman numerals. The only way I know to get around this is a conditional structure, (?(condition)then|else). It would be more or less like this (in fact the expression still needs refining):

Code:
(?i)(?(?=[clv]?i?d|[cm]m|[dm]i|clic|lcd|(m|ci)?v?il|[mdcl]|ill|livid) |(?<=\PL)([ivxlcdm]+)(?=\PL))
When the condition is satisfied (the text ahead is a common word, not a Roman numeral), the yes-pattern is used. In this example it looks for practically nothing: just the white space that follows the condition's closing parenthesis. Otherwise we look for Roman numerals.
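
For comparison, here is a rough sketch of the same idea in plain Python. It is a different technique, not the conditional itself: the standard `re` module accepts only a group number or name as a conditional test and has no `\PL`, so this version uses a stoplist and a replacement function instead. The `STOPWORDS` set and the `wrap_roman` name are invented for the illustration, and the list would need extending for a real book.

```python
import re

# Hypothetical stoplist: ordinary words made only of roman-numeral letters.
STOPWORDS = {"mi", "mil", "id", "lid", "ill", "clic", "cd", "livid", "civil"}

# Stand-in for \pL in the standard re module: a word character that is
# neither a digit nor an underscore, i.e. a letter.
LETTER = r"[^\W\d_]"

# A run of roman-numeral letters with no letter directly before or after it.
CANDIDATE = re.compile(rf"(?<!{LETTER})[ivxlcdm]+(?!{LETTER})", re.IGNORECASE)

def wrap_roman(text):
    """Wrap candidate numerals in <small>, leaving stoplisted words alone."""
    def repl(match):
        word = match.group(0)
        if word.lower() in STOPWORDS:
            return word  # common word: put it back unchanged
        return f"<small>{word}</small>"
    return CANDIDATE.sub(repl, text)

result = wrap_roman("Chapter XIV: mi CD, a livid lid.")
print(result)
```

Unlike `(?<=\PL)` and `(?=\PL)`, the negative lookarounds here also accept a numeral at the very start or end of the text. Isolated letters such as a lone "d" are still wrapped unless they are added to the stoplist.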