Hallo!
I've been building a custom news feed for the UK guardian, but I have hit a problem because my knowledge of python is none existent
Here's the rub:
The rss feed returns an article url in the format of:
Code:
http://www.guardian.co.uk/politics/2008/may/12/alistairdarling.taxandspending?gusrc=rss&feed=politics
The print version of the url looks like this:
Code:
http://www.guardian.co.uk/politics/2008/may/12/alistairdarling.taxandspending/print
So I need to replace that last bit of the url, starting the question mark (which changes from feed to feed, ie politics changes to culture) and replace it with /print.
From looking at the included feeds I think I need to be using urlparse, but can't figure out how to structure it.
My starting point is this from the WSJ recipie, but obviously it won't work in its current form:
Code:
def print_version(self, url):
article = urlparse.urlparse(url).path.rpartition('/')[-1]
return 'http://www.guardian.co.uk/'+article
Any help is MUCH appreciated.