Quote:
Originally Posted by Wiggo
I'm pretty sure it has.
Up to now I was totally unaware that there are several possible Unicode representations of the same (visual) character. Seems like one can either simply use a single "ö" character (the precomposed form, which is what NFC normalization produces) like everyone does, or use "o" followed by a combining two-dots-on-top character (the decomposed form, NFD). I checked, and of course DNB delivers its data in the decomposed form.
kovidgoyal wrote "calibre usually normalizes metadata it reads from most sources", which probably means calibre plugins like the DNB plugin should normalize those decomposed texts to the precomposed form themselves. So I'm going to do that.
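For anyone who wants to try this before I get around to it, here is a rough sketch of what I have in mind, using Python's standard unicodedata module. The helper name normalize_text is just something I made up for illustration; it is not part of the DNB plugin or of calibre, and where exactly it would be called in the plugin is still open.

```python
import unicodedata


def normalize_text(value):
    """Hypothetical helper (not an existing plugin or calibre function).

    Converts a string to NFC, so that "o" + U+0308 (combining diaeresis)
    becomes the single code point U+00F6. Non-string values are returned
    unchanged.
    """
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    return value


# "o" followed by the combining two-dots character, as DNB delivers it
decomposed = "Scho\u0308n"
composed = normalize_text(decomposed)

print(len(decomposed), len(composed))  # 6 5
print(composed == "Sch\u00f6n")        # True
```

Both strings look identical on screen, but only the second one compares equal to an "ö" typed on a keyboard, which is presumably why searching and matching went wrong.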
I'm somewhat busy ATM, so this could take a while. Btw: I happily accept pull requests.