Yes, data must be read in binary and with the right decoder.
The extension only supports UTF-8 and doesn't throw an error if a web page uses another encoding, e.g. Latin-1/ISO-8859-1. It's the first time I've had the issue in the weeks I've been using it, so it's no biggie. It was an opportunity to understand how both encodings work.
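To illustrate the "right decoder" point, here's a minimal Python sketch. It assumes the page bytes are ISO-8859-1; the sample string and the variable names are mine, not anything from the extension:

```python
# Raw bytes of "Été" as an ISO-8859-1 page would send them (sample data, assumed).
raw = b"\xc9t\xe9"

# Decoding with the wrong codec: the invalid bytes become U+FFFD instead of raising.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the codec the page actually uses.
right = raw.decode("latin-1")

print(wrong)  # �t�
print(right)  # Été
```

With the default errors="strict", the first decode would raise UnicodeDecodeError instead, which is the error I'd have liked the extension to surface.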
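For the curious in the audience, here's how a Latin-1 byte gets converted to utf-8:
1. If a byte is worth 0-127, it remains untouched
2. If it's 128-159, it's considered wrong and replaced with the sequence "0xEFBFBD", ie. "�" (strictly speaking, ISO-8859-1 puts C1 control codes in this range, so a strict converter emits 0xC2 followed by the byte unchanged instead)
3. If it's 160-255, it becomes a two-byte combo: a leading byte (0xC2 or 0xC3) followed by a trailing byte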
For instance, "É" in ISO-8859-1 is 0xC9, or 11001001 in binary. To convert it to utf-8, the first two bits (11) are put in the leading byte (110000 11) and the other bits are put in the trailing byte (10 001001) → 0xC389.
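The bit shuffling above can be sketched in Python. The function name is made up for illustration, and the sketch assumes the strict ISO-8859-1 mapping, so bytes 128-159 are encoded as C1 control codes rather than replaced with "�":

```python
def latin1_byte_to_utf8(b: int) -> bytes:
    """Convert one ISO-8859-1 byte value to its UTF-8 byte sequence (illustrative helper)."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    if b < 0x80:
        # 0-127: plain ASCII, unchanged in UTF-8.
        return bytes([b])
    # 128-255: the top two bits go into the leading byte (110000xx),
    # the low six bits go into the trailing byte (10xxxxxx).
    lead = 0b11000000 | (b >> 6)
    trail = 0b10000000 | (b & 0b00111111)
    return bytes([lead, trail])


print(latin1_byte_to_utf8(0xC9).hex())  # c389
```

As a sanity check, the result for every byte value matches what Python's own codecs produce via bytes([b]).decode("latin-1").encode("utf-8").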
https://en.wikipedia.org/wiki/UTF-8#Description