|
|
#1 |
|
Enthusiast
Posts: 26
Karma: 38
Join Date: Nov 2019
Location: Paris, France
Device: none
|
Editor plugin: problem with regex and special characters
Inside an editor plugin, I'm running regexes loaded from a JSON file, similar to saved searches.
Everything works fine except for Unicode characters with high code points. For example, I have:
Code:
{
"case_sensitive": false,
"dot_all": false,
"find": "(‘)",
"mode": "regex",
"name": "LEFT SINGLE QUOTATION MARK REPLACE",
"replace": "'"
},
My JSON file is UTF-8 encoded. I extract the pattern with:
Code:
pattern = unicode(searches["find"])
I'm using the regex module, and my compilation flags are:
regex.VERSION1 | regex.WORD | regex.FULLCASE | regex.MULTILINE | regex.UNICODE
The same problem occurs with all Unicode characters above \u2000. Any idea how to get it working? Thanks
|
|
|
|
|
#2 |
|
creator of calibre
Posts: 45,597
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Hard to say without looking at your code.
|
|
|
|
|
|
|
|
#3 |
|
Enthusiast
Posts: 26
Karma: 38
Join Date: Nov 2019
Location: Paris, France
Device: none
|
The code is rather long, but I can give the crucial points:
I extract the editor text with:
Code:
data = current_container.raw_data(file, decode=True, normalize_to_nfc=True)
and I search it with:
Code:
pattern = regex.compile(unicode(search['find']), flags)
match = pattern.search(data)
Tell me if you want more.
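Since the text is read with normalize_to_nfc=True, one thing worth ruling out is a mismatch between the normalization form of the pattern and that of the data. The sketch below is only an illustration under assumptions: it uses Python 3 and the standard re and unicodedata modules instead of the regex module and calibre's container, and the sample strings are made up. It does not affect U+2018 itself, which has no decomposed form, but it would matter for accented letters.

```python
import re
import unicodedata

# Hypothetical pattern stored in decomposed form: 'e' + COMBINING ACUTE ACCENT.
decomposed_pattern = "cafe\u0301"

# Stand-in for text that has been NFC-normalized, as raw_data(..., normalize_to_nfc=True) would return.
data = unicodedata.normalize("NFC", "Un cafe\u0301 noir")

# The decomposed pattern does not match the precomposed text.
raw_hit = re.search(decomposed_pattern, data)

# Normalizing the pattern to NFC as well makes both sides agree.
nfc_pattern = unicodedata.normalize("NFC", decomposed_pattern)
nfc_hit = re.search(nfc_pattern, data)

print(raw_hit is None, nfc_hit is not None)
```

Normalizing a whole pattern like this is only safe when it is mostly literal text; for patterns that rely on individual combining marks it may change their meaning.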
|
|
|
|
|
#4 |
|
creator of calibre
Posts: 45,597
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Looks fine to me. Check whether data actually contains the character you are looking for, using the in operator, and check what is in search['find'].
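The two checks suggested here can be sketched roughly as follows. This is a minimal illustration, assuming Python 3 and the standard json and re modules rather than the plugin's actual code; the JSON snippet and the sample text are invented stand-ins for the real saved search and the editor contents.

```python
import json
import re

# Stand-in for one saved search loaded from the UTF-8 JSON file.
search = json.loads('{"find": "(\\u2018)", "mode": "regex"}')

# Stand-in for the text returned by the editor container.
data = "He said \u2018hello\u2019."

find = search["find"]

# Check 1: does the text really contain the character being searched for?
char_present = "\u2018" in data
print(char_present)

# Check 2: what exactly is in search['find']? ascii() and the code points
# make invisible encoding problems (mojibake, stray bytes) easy to spot.
print(ascii(find))
print([hex(ord(c)) for c in find])

# If both checks look right, the pattern itself should match.
match = re.search(find, data)
print(match.start() if match else None)
```

If the code points printed for the pattern are not the expected 0x2018, the problem is in how the JSON file is read; if they are correct and the character is in the data, the problem lies elsewhere in the surrounding logic.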
|
|
|
|
|
|
#5 |
|
Enthusiast
Posts: 26
Karma: 38
Join Date: Nov 2019
Location: Paris, France
Device: none
|
Damn! Everything is fine and works.
The only problem was that in my real code I have replace: "\\1", and I was only reporting a match if match != replace. Obviously, that could never be the case here, since the replacement is the matched text itself. Thank you, Kovid, for your tips and for pointing me in the right direction. Sorry for the inconvenience.
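For anyone who hits the same thing, the pitfall can be reproduced in a few lines. This is a sketch under assumptions, not the plugin's real code: it uses Python 3 and the standard re module, and find_effective_matches is a hypothetical helper that mimics the "only count it if match != replace" filter.

```python
import re

def find_effective_matches(pattern, template, text):
    """Return (matched_text, replacement) pairs, skipping no-op replacements."""
    results = []
    for m in re.finditer(pattern, text):
        # Expand back-references such as \1 in the replacement template.
        replacement = m.expand(template)
        # This filter hides every match whose replacement equals the match.
        if m.group(0) != replacement:
            results.append((m.group(0), replacement))
    return results

text = "\u2018quoted\u2019"

# With replace = "\1" the replacement is always the match itself, so nothing is reported.
noop = find_effective_matches("(\u2018)", r"\1", text)

# With a real replacement the same pattern is reported as expected.
real = find_effective_matches("(\u2018)", "'", text)

print(noop)
print(real)
```

The regex matches in both cases; only the filter makes it look as if high code point characters were not found.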