05-28-2024, 10:20 PM   #2
KevinH
Sigil Developer
 
I am confused. Are you talking about changing font lookup tables, or about converting non-UTF-8 encoded files to UTF-8?

If the latter, and the 8-bit non-UTF-8 encoding is properly declared in the XHTML character set metadata, Sigil should recognize it and re-encode all XHTML files from that encoding to UTF-8.
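
As a rough illustration of what that kind of re-encoding involves outside Sigil, here is a minimal Python sketch. It is not Sigil's actual code: the function names declared_encoding and to_utf8 are made up for this example, and it assumes the encoding is named either in the XML declaration or in a meta charset near the top of the file.

```python
import re


def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration or a meta charset.

    Hypothetical helper: only the first 1024 bytes are inspected, and the
    default is returned when nothing is declared.
    """
    # The declaration itself is plain ASCII, so a lossy ASCII decode is enough
    # to read it even when the rest of the file uses another 8-bit encoding.
    head = raw[:1024].decode("ascii", errors="replace")
    m = re.search(r'<\?xml[^>]*encoding=["\']([\w.-]+)["\']', head)
    if m is None:
        m = re.search(r'<meta[^>]*charset=["\']?([\w.-]+)', head, re.IGNORECASE)
    return m.group(1) if m else default


def to_utf8(raw: bytes) -> bytes:
    """Decode using the declared encoding, then encode the text as UTF-8."""
    text = raw.decode(declared_encoding(raw))
    return text.encode("utf-8")


# Made-up sample: a Latin-1 file whose declaration says so.
sample = '<?xml version="1.0" encoding="iso-8859-1"?><p>\xe1</p>'.encode("latin-1")
converted = to_utf8(sample)
```

Note that this sketch leaves the old encoding name in the declaration; a real tool would also need to update it to utf-8. If the declared encoding is wrong or missing, the decode step is where things go bad, which is why the metadata has to be correct.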

If you are talking about pasting Latin-1 or other code-page encoded text into Sigil and then trying to fix it there with regular expression Find and Replace, you can do that as well, since the PCRE2 library Sigil uses supports hex escapes such as \xe1, which you can match and replace with whatever Unicode character you want.

Just look up any good reference on regular expressions, or the online documentation for the PCRE2 library.

For example:

https://www.pcre.org/current/doc/html/pcre2unicode.html

where you can find \x and other escapes.
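
To show the idea of matching stray characters by hex escape and replacing them, here is a small Python sketch. It uses Python's own re module rather than PCRE2, so treat it only as an analogy; the sample text and the FIXES table are invented for the example, and it assumes cp1252 "smart quotes" were pasted in as if they were Latin-1.

```python
import re

# Made-up sample: cp1252 curly quotes mis-decoded as Latin-1, which leaves
# the control characters U+0093 and U+0094 in the text.
broken = b"\x93Hello\x94 caf\xe9".decode("latin-1")

# Hypothetical lookup table: stray character -> intended Unicode character.
FIXES = {"\x93": "\u201c", "\x94": "\u201d"}


def fix_text(text: str) -> str:
    """Replace each stray character matched by a hex escape with its fix."""
    return re.sub(r"[\x93\x94]", lambda m: FIXES[m.group(0)], text)


fixed = fix_text(broken)
```

In Sigil the same kind of search would presumably be done one character at a time, with an escape like \x93 in the Find field and the intended character in the Replace field, but check the PCRE2 documentation for the exact escape forms it accepts.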