09-29-2010, 04:18 PM   #2
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by t3d View Post
Hello!

I have some recipes that are almost ready to publish, but some articles won't work on e-readers. I want to filter them out by URL. It should be easy, as their URLs contain some unique strings. Here is my attempt, which shows the idea but doesn't work at all:

Code:
    def get_article_url(self, article): 
        link = article.get('link')
        audio = link.find('audio')
        if not audio:
            return link
I am not familiar with Python, so I am not sure whether or not it should have something like "return NULL" when the string is found.
Try something like this:
Spoiler:

Code:
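    # Needs "import re" at the top of the recipe file.
    def preprocess_html(self, soup):
        # findAll returns a (possibly empty) list, so it can be looped over directly.
        for link in soup.findAll('a'):
            # Match 'audio' anywhere in the tag's markup (href or link text).
            if re.search('audio', str(link)):
                # Remove the element that contains the link from the page.
                link.parent.extract()
        return soup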
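
As for why the quoted get_article_url attempt misfires: str.find returns the index of the match, or -1 when there is no match, so "if not audio" is only true when 'audio' sits at the very start of the string. A substring test with "in" is the usual way to write that check. Here is a minimal sketch of just that check. The helper name filter_link and the example.com URLs are made up for illustration, and it is only an assumption on my part that returning None from get_article_url makes the article get skipped, so test it in your recipe first.

```python
def filter_link(link, blocked='audio'):
    """Return the link unchanged, or None if it contains the blocked substring.

    Hypothetical helper for illustration; not part of calibre.
    """
    if link is None or blocked in link:
        return None
    return link


# str.find gives -1 (which is truthy) when the substring is absent,
# so "if not link.find('audio')" does not do what it looks like it does.
missing = 'http://example.com/news/1'.find('audio')
print(missing, bool(missing))

print(filter_link('http://example.com/audio/1'))
print(filter_link('http://example.com/news/1'))
```

Inside a recipe, the same test would go in get_article_url, roughly "link = article.get('link')" followed by "return filter_link(link)".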

Last edited by TonytheBookworm; 09-29-2010 at 04:20 PM.