02-28-2011, 05:33 PM | #1 |
Connoisseur
Posts: 60
Karma: 10
Join Date: Dec 2010
Device: kindle
|
Different encodings in one news source
I've got a news source that uses different encodings for different feeds, and the automated detection sometimes gets it wrong. Any ideas how to handle this? I read in the manual that 'encoding' can be a callable taking two arguments, but I'm not sure what 'source', the second argument, means: the feed source or the article source? If it is the downloaded article source, I can do something like source.decode(enc, 'replace'), but where do I get the current feed URL? Or is the second argument some kind of article object?
|
03-19-2011, 04:51 PM | #2 |
Connoisseur
Posts: 60
Karma: 10
Join Date: Dec 2010
Device: kindle
|
For anyone who may be interested: the encoding function expects just one argument, source, in addition to the usual Python 'self'. URL-dependent encoding can be done as follows:
Code:
def encoding(self, source):
    if source.newurl.find('blog.aktualne') >= 0:
        enc = 'utf-8'
    else:
        enc = 'iso-8859-2'
    self.log.debug('Called encoding ' + enc + " " + str(source.newurl))
    return source.decode(enc, 'replace')
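If more feeds turn up with their own encodings, the same idea can be written as a small lookup. This is only a sketch that runs outside calibre: FakeSource is a hypothetical stand-in that assumes, as in the recipe above, that the downloaded page arrives as bytes carrying a newurl attribute. URL_ENCODINGS, DEFAULT_ENCODING, pick_encoding and decode_by_url are names made up for illustration, not part of calibre.

```python
class FakeSource(bytes):
    """Hypothetical stand-in for the downloaded page: raw bytes plus a newurl attribute."""

    def __new__(cls, data, newurl):
        obj = super().__new__(cls, data)
        obj.newurl = newurl
        return obj


# Assumed mapping from URL fragment to encoding; the first match wins.
URL_ENCODINGS = [
    ('blog.aktualne', 'utf-8'),
]
# Fallback for every URL that matches no fragment above.
DEFAULT_ENCODING = 'iso-8859-2'


def pick_encoding(url):
    """Return the encoding configured for this URL, or the default."""
    for fragment, enc in URL_ENCODINGS:
        if fragment in url:
            return enc
    return DEFAULT_ENCODING


def decode_by_url(source):
    """Decode the page bytes using the encoding chosen from source.newurl."""
    enc = pick_encoding(source.newurl)
    # 'replace' substitutes undecodable bytes instead of raising an error.
    return source.decode(enc, 'replace')
```

Inside a recipe, the body of encoding(self, source) could then be reduced to returning decode_by_url(source), with further feeds handled by adding entries to the list.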