Connoisseur
Posts: 63
Karma: 46
Join Date: Feb 2011
Device: Kindle 3 (cracked screen!); PW1; Oasis
Modification to news.py to handle Unicode byte strings
I have just posted an updated recipe for the Russian Аргументы и Факты in the recipes forum (#5 in https://www.mobileread.com/forums/sh...d.php?t=123726). It required a modification to news.py so that it handles UTF-8 byte strings (type 'bytes') as well as type 'str'. I'm posting the two changes here as a suggestion, since they may help others who encounter URL parts, file names or directory names of type 'bytes'. I am not familiar enough with git to attempt a "merge directive".
1) In canonicalize_internal_url(self, url, is_link=True): replace

    return frozenset([(parts.netloc, (parts.path or '').rstrip('/'))])

by

    zzp = parts.path
    zzn = parts.netloc
    if type(zzp) != type(' '):  # "<class 'bytes'>"
        zzp = parts.path.decode("utf-8")
        zzn = parts.netloc.decode("utf-8")
    return frozenset([(zzn, (zzp or '').rstrip('/'))])

2) In article_downloaded(self, request, result): replace

    index = os.path.join(os.path.dirname(result[0]), 'index.html')

by

    zzr = result[0]
    if type(zzr) != type(' '):
        zzr = result[0].decode("utf-8")
    index = os.path.join(os.path.dirname(zzr), 'index.html')
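Both changes follow the same pattern, so they could probably share one small helper. The following is only a standalone sketch, not the actual news.py code: the names as_text, canonical_key and index_path are made up for illustration, and it assumes that parts comes from urllib.parse.urlparse and that any bytes values are UTF-8 encoded. It uses isinstance rather than comparing types, which is the more usual test in Python.

```python
import os
from urllib.parse import urlparse


def as_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def canonical_key(url):
    # Stand-in for the return statement in canonicalize_internal_url.
    # Assumption: parts is the result of urlparse(url); for a bytes url,
    # netloc and path are both bytes.
    parts = urlparse(url)
    netloc = as_text(parts.netloc)
    path = as_text(parts.path)
    return frozenset([(netloc, (path or '').rstrip('/'))])


def index_path(downloaded):
    # Stand-in for the index computation in article_downloaded, where
    # downloaded plays the role of result[0] and may be str or bytes.
    return os.path.join(os.path.dirname(as_text(downloaded)), 'index.html')
```

With this, a str URL and the same URL as bytes give the same key, and a bytes path containing Cyrillic characters yields a str path to index.html. Note that urlparse itself only accepts ASCII when it is given bytes, so the decode step matters mainly for the values it returns and for file paths.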