Old 11-11-2017, 07:26 AM   #2
kovidgoyal
creator of calibre
You always feed bytes, not unicode strings, to urllib.quote() and hash.update(). Also, why are you hashing at all? Just use the URL as the key. And you can greatly simplify the code to something like:

Code:
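import cPickle
from os.path import exists

# `file` is assumed to hold the path of the pickled set of already-seen URLs
items = set()
if exists(file):
    with open(file, 'rb') as f:
        items = cPickle.loads(f.read())

# Drop articles whose URL was seen before, then remember the new URLs
feed.articles = [a for a in feed.articles if a.url not in items]
items |= {a.url for a in feed.articles}
with open(file, 'wb') as f:
    f.write(cPickle.dumps(items, -1))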
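
If quoting or hashing is still needed somewhere, the bytes point above could be handled roughly as follows. This is only a sketch: the helper names to_bytes and quoted_and_hashed are hypothetical, UTF-8 is an assumed encoding, and md5 is a stand-in because the hash in use is not named here. The import fallback is there so the snippet also runs on Python 3.

```python
import hashlib

try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3


def to_bytes(url, encoding='utf-8'):
    # Hypothetical helper: encode unicode text to bytes (UTF-8 assumed);
    # values that are already bytes pass through unchanged.
    if isinstance(url, bytes):
        return url
    return url.encode(encoding)


def quoted_and_hashed(url):
    # Convert once, then hand the same bytes to both quote() and the hash.
    raw = to_bytes(url)
    # md5 is only a stand-in for whatever hash object was being updated.
    return quote(raw), hashlib.md5(raw).hexdigest()
```

Calling quoted_and_hashed() with either a unicode URL or its UTF-8 bytes should then give the same pair of values, so the key no longer depends on which string type the feed happened to return.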