Originally Posted by kovidgoyal
When detecting encodings in HTML files, calibre respects the BOM (byte order mark) over declared encodings. So if your HTML files start with a UTF-16 BOM, the encoding used will be UTF-16.
Yes, the file starts with a BOM. Looking it up, the BOM 0xFF 0xFE means UTF-16 little endian, which is exactly how the file is encoded. I can also see what I did wrong when I tried to switch to UTF-8: I had the wrong BOM in the data stream. I should have changed 0xFF 0xFE to 0xEF 0xBB 0xBF.
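
For anyone who runs into the same thing, here is a rough Python sketch of checking which BOM a file starts with and re-encoding it as UTF-8 with the matching BOM. The helper names `detect_bom` and `to_utf8_with_bom` are made up for illustration, this is not calibre's own detection code, and it only looks at the UTF-8 and UTF-16 BOMs (UTF-32 is ignored).

```python
import codecs

# BOM byte sequences from the standard library, paired with the codec
# to use for the bytes that follow the BOM. UTF-32 is deliberately left
# out of this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # 0xEF 0xBB 0xBF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # 0xFF 0xFE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # 0xFE 0xFF
]


def detect_bom(data: bytes):
    """Return (codec name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def to_utf8_with_bom(data: bytes) -> bytes:
    """Decode using the encoding the BOM announces, then re-encode as
    UTF-8 and prepend the UTF-8 BOM so BOM and content agree."""
    encoding, bom_len = detect_bom(data)
    if encoding is None:
        raise ValueError("no recognised BOM at start of data")
    text = data[bom_len:].decode(encoding)
    return codecs.BOM_UTF8 + text.encode("utf-8")
```

The point of the second function is that the BOM and the bytes after it have to change together. Swapping only the BOM, or only the content, leaves the file announcing one encoding while containing another.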