08-22-2014, 03:07 AM   #2
knowledgecrawler

Quote:
Originally Posted by knowledgecrawler
Hi,

I've hit a snag:
How do I extract the article URL from a Facebook feed?
There is a website that publishes articles and posts its updates on Facebook.

Here is the feed URL:
https://www.facebook.com/feeds/page....6&format=rss20

How can we use parse_feeds to point each article at the actual URL rather than the Facebook URL?

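Finally got it. Here's the code:
Code:
    # Assumes at the top of the recipe: import re, urllib
    # and BeautifulSoup (in calibre, typically from calibre.ebooks.BeautifulSoup).
    def parse_feeds(self):
        feeds = BasicNewsRecipe.parse_feeds(self)
        for feed in feeds:
            for article in feed.articles[:]:
                # The summary HTML holds a Facebook link wrapping the real URL.
                soup = BeautifulSoup(article.summary)
                x = soup.find("a")
                if x is None or not x.get("href"):
                    continue  # no link in this summary; keep the feed's URL
                # Decode the percent-encoded href (urllib.unquote is the Python 2 name).
                url = urllib.unquote(x["href"])
                m = re.findall(r"(http://orfonline.*mmacmaid=\d*)", url)
                if m:
                    article.url = m[0]
        return feeds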
This does the trick...
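
For anyone who wants to try the extraction step outside calibre, here is a rough standalone sketch in Python 3. It only mirrors the idea from the recipe (take the first link in the summary, unquote it, then apply the same regex). The helper names (FirstLinkParser, extract_article_url) and the sample summary, including the redirect-style link and the orfonline.example address, are made up for illustration and are not taken from the real feed. It uses the standard library's HTMLParser instead of BeautifulSoup so it runs without extra packages.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Same pattern as in the recipe.
ARTICLE_RE = re.compile(r"(http://orfonline.*mmacmaid=\d*)")


class FirstLinkParser(HTMLParser):
    """Hypothetical helper: remembers the href of the first <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def extract_article_url(summary_html):
    """Return the article URL embedded in the first link, or None if absent."""
    parser = FirstLinkParser()
    parser.feed(summary_html)
    if not parser.href:
        return None
    # Percent-decode the href, then look for the article URL inside it.
    match = ARTICLE_RE.search(unquote(parser.href))
    return match.group(1) if match else None


# Invented sample: a redirect-style link wrapping a percent-encoded article URL.
sample_summary = (
    '<a href="http://www.facebook.com/l.php?u=http%3A%2F%2Forfonline.example'
    '%2Farticle.html%3Fcmaid%3D1%26mmacmaid%3D12345&amp;h=abc">Read more</a>'
)

if __name__ == "__main__":
    print(extract_article_url(sample_summary))
```

If the summary has no link, or the link does not contain a matching URL, the function returns None, which corresponds to leaving article.url untouched in the recipe.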
