Originally Posted by KevinH
Okay, I think xmlescape and HTMLparser both work better with full unicode strings. At that point, all metadata has already been encoded as utf-8, so I have modified mobi_opf.py to convert all required pieces from utf-8 to full unicode, pass through the xmlescape and escape methods, and then convert back to the needed utf-8 for the opf file.
Just a heads up: there are three more places in the mobi_opf script where data gets the unescape->escape treatment in addition to the handleTag and handleMetaPairs methods.
Would it make sense to apply the same full-unicode conversion in those three additional locations?
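For anyone following along, here is a rough sketch of the round trip as I understand it from the quote: decode the utf-8 bytes to unicode, unescape, re-escape, then encode back to utf-8. This is not the actual code in mobi_opf.py. The helper name reescape_utf8 is made up for illustration, and I am assuming Python 3's html.unescape and xml.sax.saxutils.escape as stand-ins for whatever unescape/escape calls the script really uses.

```python
from html import unescape
from xml.sax.saxutils import escape


def reescape_utf8(data: bytes) -> bytes:
    """Hypothetical helper (not part of mobi_opf.py).

    Takes utf-8 encoded metadata, normalizes its XML escaping while
    working on a unicode string, and returns utf-8 bytes for the opf.
    """
    # Step 1: utf-8 bytes -> unicode string
    text = data.decode('utf-8')
    # Step 2: undo any existing entity escaping (e.g. &amp; -> &)
    text = unescape(text)
    # Step 3: escape once for XML; also escape double quotes
    text = escape(text, {'"': '&quot;'})
    # Step 4: unicode string -> utf-8 bytes
    return text.encode('utf-8')


if __name__ == '__main__':
    print(reescape_utf8(b'Tom &amp; Jerry &lt;caf\xc3\xa9&gt;'))
```

Unescaping before escaping means already-escaped input does not get double-escaped, and doing both steps on the decoded string keeps multi-byte characters intact. If the three other locations could call one shared helper like this, the conversion would only live in one place.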