As for the links to the full stories from the Globe and Mail: in the profile I posted earlier, I was using the following function to retrieve the full stories from the Globe Investor web site. The Globe Investor produces a very nice print version without any extra HTML. I was using the function to create print versions of the news stories from the Globe and Mail RSS feeds (e.g., http://www.theglobeandmail.com/gener...s/BN/Front.xml).
def print_version(self, url):
    return 'http://www.globeinvestor.com/servlet/ArticleNews/print/' + (url.split('/story/',1)[1]).split('.',1)[0] + '/' + url.rsplit('.',3)[2] + '/' + url.rsplit('.',3)[3]
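To show what the function above does to a URL, here is a standalone sketch of the same string manipulation, without `self` so it runs outside a recipe. The sample story URL is made up; I am only assuming it has the general shape `.../story/<prefix>.<date>.<id>/...` that the split calls expect.

```python
def print_version(url):
    # Text between '/story/' and the first dot after it (e.g. 'RTGAM').
    story_prefix = url.split('/story/', 1)[1].split('.', 1)[0]
    # Split on the last three dots; pieces 2 and 3 are the date and the
    # remainder of the path (story id plus section).
    parts = url.rsplit('.', 3)
    return ('http://www.globeinvestor.com/servlet/ArticleNews/print/'
            + story_prefix + '/' + parts[2] + '/' + parts[3])


# Hypothetical story URL, for illustration only.
sample = ('http://www.theglobeandmail.com/servlet/story/'
          'RTGAM.20080101.wexample/BNStory/Front/home')

print(print_version(sample))
# http://www.globeinvestor.com/servlet/ArticleNews/print/RTGAM/20080101/wexample/BNStory/Front/home
```

Note that the function depends on the URL containing exactly this dot pattern; a link in a different form (such as a feed redirect link) would raise an IndexError or produce a wrong address.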
The problem I ran into is that most of the links to the full stories are contained within the <feedburner:origLink> tag. With the old libprs500, I was using url_search_order = ['feedburner:origlink']. This seemed to work; however, this variable no longer seems to exist in Calibre's BasicNewsRecipe. I can't figure out how to make Calibre follow the links contained within the <feedburner:origLink> tags. I'm guessing I will need to process this somehow through another function?
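To make clear which value I am after, here is a minimal standalone sketch (plain Python standard library, not Calibre code) that pulls the original link out of each feed item. The feed snippet and the `original_links` helper are made up for illustration, and I am assuming the usual FeedBurner namespace URI. How to hook something like this into a recipe is the part I have not figured out.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the feedburner: prefix.
FEEDBURNER_NS = 'http://rssnamespace.org/feedburner/ext/1.0'

# Hypothetical feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Example story</title>
      <link>http://feeds.example.com/~r/example/1</link>
      <feedburner:origLink>http://www.example.com/story/1</feedburner:origLink>
    </item>
  </channel>
</rss>"""


def original_links(feed_xml):
    """Return one URL per item, preferring feedburner:origLink over link."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter('item'):
        orig = item.find('{%s}origLink' % FEEDBURNER_NS)
        if orig is not None and orig.text:
            links.append(orig.text.strip())
        else:
            # Fall back to the ordinary <link> when no origLink is present.
            links.append((item.findtext('link') or '').strip())
    return links


print(original_links(SAMPLE_FEED))
```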