MobileRead Forums - View Single Post

Ben_B · 05-22-2008, 01:31 AM

As for the links to the full stories from the Globe and Mail: in the profile I posted earlier, I was using the following function to retrieve the full stories from the Globe Investor web site. Globe Investor produces a very nice print version without any extra HTML. I was using the function to create print versions of the news stories from the Globe and Mail RSS feeds (e.g., http://www.theglobeandmail.com/gener...s/BN/Front.xml).

def print_version(self, url):
    return 'http://www.globeinvestor.com/servlet/ArticleNews/print/' + (url.split('/story/',1)[1]).split('.',1)[0] + '/' + url.rsplit('.',3)[2] + '/' + url.rsplit('.',3)[3]
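To make the string slicing easier to follow, here is a minimal standalone sketch of the same logic (without self, so it runs outside a recipe). The story URL is a made-up example of the shape I am assuming; the real feed URLs may differ.

```python
def print_version(url):
    """Standalone copy of the recipe method, split into named steps."""
    # Text between '/story/' and the first dot after it, e.g. 'RTGAM'.
    section = url.split('/story/', 1)[1].split('.', 1)[0]
    # Split on the last three dots; pieces 2 and 3 are the date and the rest.
    parts = url.rsplit('.', 3)
    return ('http://www.globeinvestor.com/servlet/ArticleNews/print/'
            + section + '/' + parts[2] + '/' + parts[3])


# Hypothetical story URL, used only for illustration.
example = ('http://www.theglobeandmail.com/servlet/story/'
           'RTGAM.20080521.wexample/BNStory/Front')

print(print_version(example))
# http://www.globeinvestor.com/servlet/ArticleNews/print/RTGAM/20080521/wexample/BNStory/Front
```

Note that the function depends on the URL containing '/story/' and at least three dots; any other shape raises an IndexError.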

The problem I ran into is that the links to most of the full stories are contained within the tag <feedburner:origLink>. With the old libprs500, I was using url_search_order = ['feedburner:origlink']. This seemed to work; however, this variable no longer seems to exist in Calibre's Basic News Recipe. I can't figure out how to make Calibre follow the links contained within the <feedburner:origLink> tags. I'm guessing I will need to process this somehow through another function?
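To show what I mean by the link living in a separate tag, here is a rough sketch that pulls the original link out of each feed item using only the Python standard library. It does not use any Calibre API, so it is not the answer to how the recipe should do it. The sample feed and the function name original_links are invented for illustration, and I am assuming the usual FeedBurner namespace URI.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the feedburner: prefix.
FEEDBURNER_NS = 'http://rssnamespace.org/feedburner/ext/1.0'

# Made-up feed: one item with an origLink, one without.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
  <channel>
    <title>Sample</title>
    <item>
      <title>Story one</title>
      <link>http://feeds.example.com/~r/sample/1</link>
      <feedburner:origLink>http://www.example.com/story/one</feedburner:origLink>
    </item>
    <item>
      <title>Story two</title>
      <link>http://www.example.com/story/two</link>
    </item>
  </channel>
</rss>"""


def original_links(feed_xml):
    """Return one URL per item, preferring feedburner:origLink over link."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter('item'):
        orig = item.findtext('{%s}origLink' % FEEDBURNER_NS)
        # Fall back to the plain <link> when no origLink is present.
        links.append(orig or item.findtext('link'))
    return links


print(original_links(SAMPLE_FEED))
# ['http://www.example.com/story/one', 'http://www.example.com/story/two']
```

The URLs returned this way are the ones I would want to pass on to print_version.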
