11-11-2017, 07:26 AM   #2
kovidgoyal
creator of calibre
You always feed bytes to urllib.quote() and hash.update(), not unicode strings, so encode the URL (for example as UTF-8) before passing it to either. Also, why are you hashing at all? Just use the URL itself as the key. And you can greatly simplify the code to something like:

Code:
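import cPickle
from os.path import exists

# file is the path where the set of already-seen article URLs is stored
items = set()
if exists(file):
    with open(file, 'rb') as f:
        items = cPickle.load(f)

# Keep only articles not seen on earlier runs, then remember their URLs
feed.articles = [a for a in feed.articles if a.url not in items]
items |= {a.url for a in feed.articles}
with open(file, 'wb') as f:
    cPickle.dump(items, f, -1)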
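
If you want to try the idea outside a recipe, here is a minimal standalone sketch of the same approach. It assumes Python 3, where cPickle is gone and the plain pickle module takes its place. The helper name filter_new, its arguments and the example.com URLs are made up for illustration and are not part of calibre or of the code above.

```python
import os
import pickle
import tempfile


def filter_new(urls, cache_path):
    """Return the URLs not seen on earlier runs and record them in the cache.

    Hypothetical helper: the URL string itself is the key, no hashing needed.
    """
    seen = set()
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            seen = pickle.load(f)
    # Keep the original order of the incoming URLs
    new = [u for u in urls if u not in seen]
    seen.update(new)
    with open(cache_path, 'wb') as f:
        pickle.dump(seen, f, pickle.HIGHEST_PROTOCOL)
    return new


if __name__ == '__main__':
    # Throwaway cache file so the demo does not touch real data
    cache = os.path.join(tempfile.mkdtemp(), 'seen_urls.pickle')
    print(filter_new(['http://example.com/a', 'http://example.com/b'], cache))
    # -> ['http://example.com/a', 'http://example.com/b']
    print(filter_new(['http://example.com/b', 'http://example.com/c'], cache))
    # -> ['http://example.com/c']
```

The second call only returns the URL that was not in the cache after the first call, which is the behaviour you want for skipping articles that were already downloaded.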