Thread: Unicode issues
07-06-2014, 05:09 AM   #2
kovidgoyal
creator of calibre
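When you compare a unicode string to a bytestring in Python, Python tries to convert the bytestring to unicode automatically for the comparison, using a "default" encoding that is system dependent. If the bytes are not actually in that encoding, the comparison will not behave the way you expect.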

You should almost always manually decode strings that come from text files before doing anything with them. There are routines to help you do that in the calibre.ebooks.chardet module.

Or see the implementation of the decode() method http://manual.calibre-ebook.com/poli...ntainer.decode
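
To illustrate the "decode manually first" advice, here is a minimal sketch using only the Python standard library. It is not calibre's code and does not use calibre.ebooks.chardet; the helper name decode_text and the list of candidate encodings are assumptions made up for this example. It is written for Python 3, where bytes and text never compare equal, so the explicit decode is still needed.

```python
import codecs


def decode_text(raw, candidates=("utf-8", "cp1252")):
    """Hypothetical helper: turn bytes read from a text file into text.

    The candidate encodings are an assumption for this sketch; use
    whatever encodings your files are actually likely to be in.
    """
    if isinstance(raw, str):
        return raw  # already decoded, nothing to do

    # A UTF-8 byte order mark tells us the encoding outright.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")

    # Try each candidate strictly; move on if the bytes are invalid.
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    # latin-1 maps every byte to a code point, so this never raises,
    # although the result may be wrong for the real encoding.
    return raw.decode("latin-1")


# Decode first, then compare text with text.
raw = "café".encode("utf-8")  # stands in for bytes read from a file
print(decode_text(raw) == "café")
```

The point is only that the decode step is explicit and happens once, right after reading the file, so every later comparison is between two text strings.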