Old 07-23-2014, 09:50 AM   #11
DaltonST
Deviser
My native Python 2.7 IDLE accepts my utf-8 u'N\xe3o-fic\xe7\xe3o' when I copy it in, and then prints it correctly as Não-ficção on my screen.
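For reference, here is a minimal sketch of what that literal contains, assuming nothing beyond the standard library. u'N\xe3o-fic\xe7\xe3o' is a Unicode string of code points; UTF-8 is the byte sequence produced when that string is encoded. The variable names are only illustrative.

```python
from __future__ import print_function

# A Unicode string: 10 code points (\xe3 is a-tilde, \xe7 is c-cedilla).
text = u'N\xe3o-fic\xe7\xe3o'

# Encoding it as UTF-8 gives bytes; each accented letter takes 2 bytes.
encoded = text.encode('utf-8')

# Decoding the UTF-8 bytes gives back the same Unicode string.
decoded = encoded.decode('utf-8')

print(len(text), len(encoded))   # number of characters vs. number of bytes
print(repr(encoded))             # the raw UTF-8 byte sequence
print(decoded == text)           # the round trip is lossless
```

The same code runs under Python 2.7 and Python 3, since both accept the u'' prefix.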

I do not use any byte strings. Pure utf-8. I use temp tables in metadata.db, and a SQLite management application shows the utf-8 unicode strings in them properly as Não-ficção on my PC display. So metadata.db has the pure utf-8 data, and I read it back from there to pass it to mi.set_user_metadata('#customxxx', custcol).
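The round trip through a SQLite temp table can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory database instead of metadata.db, the table name tmp_custcol is hypothetical, and it does not call any calibre API.

```python
from __future__ import print_function
import sqlite3

text = u'N\xe3o-fic\xe7\xe3o'

# In-memory database standing in for metadata.db (assumption for the sketch).
conn = sqlite3.connect(':memory:')

# Hypothetical temp table; the real table names are not given in the post.
conn.execute('CREATE TEMP TABLE tmp_custcol (val TEXT)')
conn.execute('INSERT INTO tmp_custcol (val) VALUES (?)', (text,))

# length(val) counts characters; length of the BLOB cast counts stored bytes.
row = conn.execute(
    'SELECT val, length(val), length(CAST(val AS BLOB)) FROM tmp_custcol'
).fetchone()
fetched, n_chars, n_bytes = row
conn.close()

print(fetched == text)   # the Unicode string comes back unchanged
print(n_chars, n_bytes)  # characters vs. bytes as stored by SQLite
```

SQLite stores TEXT as UTF-8 by default, and the sqlite3 module hands it back as a Unicode string, so the value read from the table compares equal to the original.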

According to https://docs.python.org/2/howto/unicode.html, "Under the hood, Python represents Unicode strings as either 16- or 32-bit integers, depending on how the Python interpreter was compiled". That doesn't mean that it does not support utf-8.

The same source also says: "UTF-8 is one of the most commonly used encodings. UTF stands for “Unicode Transformation Format”, and the ‘8’ means that 8-bit numbers are used in the encoding. (There’s also a UTF-16 encoding, but it’s less frequently used than UTF-8.)"

Calibre's bundled copy of Python 2.7.x apparently does not support utf-8, although SQLite does. If SQLite did not, metadata.db would not be updated correctly.

Kovid, thanks again for your help.