To make things even harder, a Qt QChar is a single 16-bit UTF-16 code unit (little endian on most platforms), which makes offsets hard to work with unless you define exactly which basis you are using: bytes, UTF-16 code units, or Unicode code points!
The validation plugin should be passed the offset in Unicode code points if you need exact positioning in the validation result window inside Sigil. The gumbo line and col info can be used to accurately determine the offset in code points.
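
As a rough illustration of the line/col approach, here is a minimal Python 3 sketch. It is not Sigil or gumbo code: the helper name codepoint_offset is made up for this example, and it assumes the line and col values are 1-based, that col counts code points (not bytes, and with no tab expansion), and that lines end in a plain "\n". Check those assumptions against what your gumbo build actually reports before relying on it.

```python
def codepoint_offset(text, line, col):
    """Return the 0-based code point offset for a 1-based (line, col) position.

    Hypothetical helper: assumes col counts code points and that lines
    are separated by a single "\n".
    """
    lines = text.split("\n")
    # Code points in every earlier line, plus one newline per earlier line.
    before = sum(len(earlier) + 1 for earlier in lines[:line - 1])
    return before + (col - 1)


# A Python 3 str is indexed by code point, so the result can be used directly.
sample = "<p>h\u00e9llo</p>\n<p>\U0001F600 ok</p>\n"
offset = codepoint_offset(sample, 2, 6)
print(offset, repr(sample[offset]))
```

The emoji on the second line counts as one code point here, even though it is four bytes in utf-8 and two QChars in a QString.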
If, on the other hand, you want to use offsets into Python utf-8 bytestrings to extract things, the gumbo offsets can be used directly for that.
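
To show how the three bases differ, here is a small Python 3 sketch using only the standard codecs. The function names are made up for illustration, and the byte offset is found with bytes.find() as a stand-in for an offset reported by the parser.

```python
def slice_by_byte_offsets(data, start, end):
    """Extract text from a utf-8 bytestring using byte offsets."""
    return data[start:end].decode("utf-8")


def byte_to_codepoint_offset(data, byte_offset):
    """Convert a utf-8 byte offset into a Unicode code point offset."""
    return len(data[:byte_offset].decode("utf-8"))


def byte_to_utf16_offset(data, byte_offset):
    """Convert a utf-8 byte offset into a count of 16-bit UTF-16 code units."""
    prefix = data[:byte_offset].decode("utf-8")
    return len(prefix.encode("utf-16-le")) // 2


text = "<p>\U0001F600 h\u00e9llo</p>"
data = text.encode("utf-8")

# Stand-in for a parser-reported byte offset: where "h" starts in the bytes.
start = data.find(b"h")
end = start + len("h\u00e9llo".encode("utf-8"))

print(start)                                  # byte basis
print(byte_to_codepoint_offset(data, start))  # code point basis
print(byte_to_utf16_offset(data, start))      # UTF-16 code unit basis
print(slice_by_byte_offsets(data, start, end))
```

The same position comes out as three different numbers because the emoji takes four bytes in utf-8, two UTF-16 code units, and one code point.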
Last edited by KevinH; 01-02-2018 at 12:17 PM.