MobileRead Forums - View Single Post

kovidgoyal · 07-06-2014, 06:09 AM

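When you compare a unicode string to a bytestring in Python 2, Python tries to automatically convert the bytestring to unicode for the comparison, using a "default" encoding that is system dependent. If the bytes are not valid in that encoding, the conversion fails and the two strings simply compare as unequal, with only a UnicodeWarning to tell you why.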

You should almost always manually decode strings that come from text files before doing anything with them. There are routines to help you do that in the calibre.ebooks.chardet module.

Or see the implementation of the decode() method: http://manual.calibre-ebook.com/poli...ntainer.decode
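
To illustrate the "decode first, then compare" advice, here is a minimal sketch. It is not calibre's code: decode_text() and its list of fallback encodings are hypothetical names made up for this example, and the real routines in calibre.ebooks.chardet do proper encoding detection rather than simple trial and error.

```python
def decode_text(raw, encodings=("utf-8-sig", "cp1252")):
    """Hypothetical helper: turn bytes read from a text file into text.

    The encodings tried here are an assumption for illustration only;
    a real detector inspects the data instead of guessing in order.
    """
    if not isinstance(raw, bytes):
        # Already text, nothing to decode.
        return raw
    for enc in encodings:
        try:
            # "utf-8-sig" accepts UTF-8 with or without a leading BOM.
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this never raises.
    return raw.decode("latin-1")


# Decode explicitly, then compare text with text.
from_utf8_file = "café".encode("utf-8")
from_legacy_file = b"caf\xe9"  # the same word in a single-byte Western encoding

same_as_utf8 = decode_text(from_utf8_file) == "café"
same_as_legacy = decode_text(from_legacy_file) == "café"
```

Both comparisons succeed because each side is text by the time it is compared, so nothing depends on whatever default encoding the system happens to use.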
