Old 07-13-2021, 04:00 PM   #12
DNSB
Bibliophagist
Posts: 47,913
Karma: 174315098
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by arakish
<?xml version="1.0" encoding="utf-16"?>

the XHTML file saved, but it was completely goobly-doo with Asian ideograms instead of english latin characters. I have an Ebook project that would be fantastic if I could use UTF characters above the UTF-8. Otherwise, I do not look forward to making a bunch of PNGs of the characters I wish to use. But will if I have to... ...
UTF-16 assumes that your entire document is encoded in 2-byte units (with a pair of units for characters outside the Basic Multilingual Plane), whereas UTF-8 uses variable-length sequences of 1 to 4 bytes. When you attempted to force UTF-16, every pair of bytes in the existing text was interpreted as a single character, which would give, uummm, interesting results. I.e. instead of seeing the bytes 0x4A, 0x7E as 'J~', they would be shown as a single '䩾' character or a single '繊' character, depending on whether you used big or little endian interpretation.
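If you want to see the effect for yourself, here is a minimal Python sketch (my own illustration, not anything your editor does internally) that decodes the same two bytes, 0x4A and 0x7E, three different ways. The variable names are just placeholders.

```python
# The same two bytes, read under three different encoding assumptions.
raw = bytes([0x4A, 0x7E])

as_utf8 = raw.decode("utf-8")          # each byte is its own character
as_utf16_be = raw.decode("utf-16-be")  # both bytes form one unit: 0x4A7E
as_utf16_le = raw.decode("utf-16-le")  # byte order swapped: 0x7E4A

print("UTF-8    :", as_utf8)
print("UTF-16 BE:", as_utf16_be, hex(ord(as_utf16_be)))
print("UTF-16 LE:", as_utf16_le, hex(ord(as_utf16_le)))
```

The bytes never change; only the declared encoding does, which is why an entire file of ordinary Latin text turns into ideograms.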

Given that UTF-8 is capable of encoding the entire Unicode character set, neither UTF-16 nor UTF-32 is very useful, IMHO.
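As a rough check of that claim, the following Python sketch round-trips a few sample characters through UTF-8, including one well outside the 16-bit range, and records how many bytes each encoding needs. The sample characters are my own picks for illustration.

```python
# Sample characters: ASCII, accented Latin, CJK, and one beyond U+FFFF.
samples = ["A", "\u00e9", "\u7e4a", "\U0001D11E"]

sizes = {}
for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # explicit byte order, so no BOM is added
    utf32 = ch.encode("utf-32-be")
    # UTF-8 must give back exactly the character we started with.
    assert utf8.decode("utf-8") == ch
    sizes[ch] = (len(utf8), len(utf16), len(utf32))
    print(f"U+{ord(ch):04X}: utf-8={len(utf8)} utf-16={len(utf16)} utf-32={len(utf32)} bytes")
```

So any character you want in the ebook can stay in a UTF-8 XHTML file; whether it displays is then a question of the font having the glyph, not of the encoding.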