01-14-2014, 07:16 AM   #1
mrmikel
What is the format to search for Unicode in calibre with regex?

I got some material off the web that has Unicode paragraph separators (U+2029) in it.

I found it very hard to select just this character alone... I ended up having to select the space(s) near it as well.

What is the correct regex formulation to match such a Unicode character in general, so I can pull out just the character?

I tried several combinations using \x and curly brackets, but nothing seemed to work.

As I wrote this, I realized I can get at it through the special characters (diacritic marks) feature... thanks again for it.

But for some of the more obscure characters it might be faster to just look up the code point number and match it directly with a regex.
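
To make the question concrete, here is a minimal sketch in plain Python of the kind of match I am after. It uses the standard `re` module, where `\u2029` in a pattern stands for the code point U+2029. Whether calibre's search box accepts the same escape is an assumption I have not confirmed, and the sample text and the helper name `remove_codepoint` are made up for illustration.

```python
import re

# Made-up sample: a paragraph separator (U+2029) with spaces on both sides.
sample = "First paragraph. \u2029 Second paragraph."

# In Python 3's re module, \uXXXX inside a pattern matches that code point.
pattern = re.compile(r"\u2029")

# Find only the separator itself, not the neighbouring spaces.
matches = pattern.findall(sample)

# Remove just that one character; the surrounding spaces are left alone.
cleaned = pattern.sub("", sample)


def remove_codepoint(text, codepoint):
    """Hypothetical helper: strip one character, given its code point number."""
    return re.sub(re.escape(chr(codepoint)), "", text)
```

With something like this, knowing the number is enough: `remove_codepoint(sample, 0x2029)` gives the same result as `cleaned` above.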