MobileRead Forums

Doitsu · 06-11-2013, 07:01 AM

Quote:

Originally Posted by mzmm

am i right to understand that this pattern \p{L} falls under the category of unicode properties, and is not supported by python interpreters?

I don't know which Unicode properties they referred to, but you could easily find out whether \p{L} is supported by the regex engine of your editor by actually using it.

For example, the following regex, which works in Sigil, will find Greek text (because of its simple design, it'll also match runs of two or more spaces).

[\p{Greek} ]{2,}

To test it, just copy any Greek text (e.g. μὴ μοῦ τοὺς κύκλους τάραττε.) into a text file and search it with the above regex. If the regex engine of your editor supports Unicode properties such as \p{Greek}, it should find the complete phrase (and the space before it); you can check \p{L} the same way.
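On the Python side of the question, the same "just try it" test can be sketched in a few lines. As far as I know, the standard library's re module rejects \p{...} escapes, while the third-party regex module accepts them. The function name supports_unicode_properties and the fallback pattern GREEK_RUN are made up for this illustration, and the explicit code point ranges (the Greek and Coptic block plus the Greek Extended block) are only a rough stand-in for \p{Greek}, not an exact equivalent.

```python
import re
import unicodedata


def supports_unicode_properties(pattern=r"\p{L}"):
    """Return True if the stdlib `re` module can compile `pattern`.

    Hypothetical helper for this example: it simply tries the pattern
    and reports whether compilation failed.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Assumption: approximate \p{Greek} with two Unicode blocks,
# Greek and Coptic (U+0370-U+03FF) and Greek Extended (U+1F00-U+1FFF).
# Like the regex in the post, the class also contains a space.
GREEK_RUN = re.compile(r"[\u0370-\u03FF\u1F00-\u1FFF ]{2,}")

# Normalize to NFC so accented letters are single precomposed code points
# that fall inside the ranges above.
sample = unicodedata.normalize("NFC", "Archimedes: μὴ μοῦ τοὺς κύκλους τάραττε.")

match = GREEK_RUN.search(sample)

print("stdlib re accepts \\p{L}:", supports_unicode_properties())
if match:
    # repr() makes the leading space visible
    print("matched:", repr(match.group()))
```

If the fallback works, the match should cover the whole phrase plus the space before it, but not the closing full stop, which is not a Greek character. With the third-party regex module installed, the original pattern from the post could be tried directly instead of the hand-written ranges.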