Old 09-21-2010, 05:11 PM   #2792
krunk
OK, now I'm attempting to remove duplicate URLs that may appear in multiple feeds that I'm aggregating.

I created a list called 'added_links' and then overrode the is_link_wanted method like so:

Code:
    def is_link_wanted(self, url, tag):
        wanted = False

        # Accept a URL only the first time it is seen across all feeds
        if url not in self.added_links:
            self.added_links.append(url)
            wanted = True

        return wanted
This seems to catch duplicate URLs accurately: if I print out the added_links list, each URL is listed only once. However, the duplicate articles/URLs still appear in the final ebook.
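For what it's worth, the dedup logic itself can be exercised outside calibre. Here is a minimal standalone sketch (class name and URLs are hypothetical, and `tag` is ignored just as in the recipe code above):

```python
# Standalone sketch of the duplicate-URL filter from the recipe above;
# no calibre needed. 'added_links' mirrors the recipe attribute.
class DedupFilter:
    def __init__(self):
        self.added_links = []

    def is_link_wanted(self, url, tag=None):
        # Accept a URL only the first time it is seen.
        if url not in self.added_links:
            self.added_links.append(url)
            return True
        return False

f = DedupFilter()
print(f.is_link_wanted('http://example.com/a'))  # True  (first sighting)
print(f.is_link_wanted('http://example.com/a'))  # False (duplicate rejected)
```

So the filter behaves as expected in isolation, which suggests the problem is in where (or whether) calibre calls it during feed aggregation, not in the logic itself.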