08-21-2023, 06:18 AM   #1
Azraelo
Device: Kobo Clara HD
Search for Unicode character (ranges)

Hello,

I've got an EPUB that contains several weird Unicode characters, and I wanted to use a regex to get rid of them in the EPUB editor.

Example of text in the epub:
𝒷.𝓬𝑶𝐦

According to my research, these characters should correspond to the following Unicode code points:
\u1D4B7
\u002E
\u1D4EC
\u1D476
\u1D426

But no matter what I try, the search function never matches those characters.
I opened a text file in the EPUB editor, put "\u1D4B7" into the search field and changed the mode to "Regex".
When searching, nothing is found.
If I search for "[\u1D400-\u1D4FF]" instead, every ordinary letter (a-zA-Z) is reported as a match.

What is the logic behind this?
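
The behaviour described above can be reproduced in plain Python. This is only a sketch, and it assumes the editor's regex engine follows the same escape rules as Python's `re` module, where `\u` consumes exactly four hex digits and `\U` consumes exactly eight. Under that assumption, "\u1D4B7" means U+1D4B followed by a literal "7", and "[\u1D400-\u1D4FF]" means U+1D40, the range from "0" to U+1D4F, and the letter "F". The variable names are made up for illustration.

```python
import re

# The sample from the post, written with 8-digit escapes, plus some ASCII.
sample = "\U0001D4B7.\U0001D4EC\U0001D476\U0001D426 abc"

# Assumption: \u takes exactly 4 hex digits, so this pattern is
# U+1D4B followed by a literal "7", which never occurs in the sample.
four_digit = re.compile(r"\u1D4B7")
no_match = four_digit.search(sample)

# Parsed as: U+1D40, the range "0"..U+1D4F, and a literal "F".
# The range "0"..U+1D4F contains every ASCII letter and digit.
accidental = re.compile(r"[\u1D400-\u1D4FF]")
ascii_hits = accidental.findall("abcXYZ")

# With 8-digit \U escapes the class covers U+1D400..U+1D4FF as intended.
intended = re.compile(r"[\U0001D400-\U0001D4FF]")
styled_hits = intended.findall(sample)

print(no_match)
print(ascii_hits)
print(len(styled_hits))
```

If the assumption holds, the first search finds nothing, the second class matches plain letters, and only the third form matches the four styled letters.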

My intention was to search for something like "[\u1D400-\u1D4FF\u002E]{4,20}" and replace it with nothing.
Can you please give me a hint on how to accomplish this?
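
As a rough sketch of that intended replacement, the same idea can be tried outside the editor in Python. It assumes 8-digit `\U` escapes are what the engine needs for code points above U+FFFF. Whether the editor's search field accepts exactly this syntax is an assumption, and `strip_styled_runs` is a hypothetical helper name.

```python
import re

# Runs of 4 to 20 characters drawn from U+1D400..U+1D4FF or a full stop.
# Assumption: code points above U+FFFF need the 8-digit \U form.
STYLED_RUN = re.compile(r"[\U0001D400-\U0001D4FF\u002E]{4,20}")


def strip_styled_runs(text: str) -> str:
    """Remove each matching run, i.e. replace it with nothing."""
    return STYLED_RUN.sub("", text)


before = "Visit \U0001D4B7.\U0001D4EC\U0001D476\U0001D426 now."
after = strip_styled_runs(before)
print(repr(after))
```

Because the full stop is part of the character class, a run of four or more ordinary dots ("....") would be removed as well, so it may be worth checking each match before replacing all.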

Regards
Azraelo