MobileRead Forums - View Single Post - [Metadata Source Plugin] ANOBII (& inmondadori.it)

dave9000 · 09-16-2011, 08:40 AM

There's a small problem with character encoding (windows-1252).
E.g. this sentence:
"... uno dei piÃƒÂ¹ famosi scrittori ..."
should be
"... uno dei più famosi scrittori ..."

It looks like the original text is encoded TWICE with UTF-8: the page's UTF-8 bytes seem to be decoded as windows-1252 and then encoded as UTF-8 again.
Thanks for the useful plugin anyway!

P.S. the following patch in worker.py seems to fix the issue:
OLD: raw = raw.decode('windows-1252', errors='replace')
NEW: raw = raw.decode('utf-8', errors='replace')
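
For anyone who wants to reproduce the effect outside the plugin, here is a minimal sketch. It assumes the page really is served as UTF-8 and uses Python 3 syntax for illustration; `SAMPLE` and `decode_page` are hypothetical names, not part of worker.py.

```python
# Hypothetical stand-in for the text fetched by the plugin.
SAMPLE = "uno dei più famosi scrittori"


def decode_page(raw, encoding):
    """Decode raw page bytes the same way the OLD/NEW lines do."""
    return raw.decode(encoding, errors="replace")


# Assumption: the server sends UTF-8 bytes.
raw = SAMPLE.encode("utf-8")

# OLD line: UTF-8 bytes read as windows-1252, so 'ù' turns into two characters.
wrong = decode_page(raw, "windows-1252")

# NEW line: the bytes are read with the codec they were written in.
right = decode_page(raw, "utf-8")

# One more mismatched encode/decode round gives the four-character garbage
# quoted in the post.
doubled = decode_page(wrong.encode("utf-8"), "windows-1252")
```

Here `right` equals the original sentence, `wrong` contains "piÃ¹", and `doubled` contains "piÃƒÂ¹", which matches what shows up in the metadata.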

Post #17. Last edited by dave9000; 09-16-2011 at 01:27 PM.