Old 10-09-2008, 12:47 PM   #1
Acey
Device: Sony PRS-505
Help with news recipe

I'm having trouble getting the print versions of articles from the Orlando Sentinel. The problem is that the site uses completely different article numbers for the regular and print-friendly versions of the same story, so the print URL can't be derived from the regular URL alone.

For instance:

In this RSS feed: http://feeds.feedburner.com/orlandosentinel

Regular version with the link provided in RSS: http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,2581414.story

Print-friendly version (link is found on regular article's page): http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,95752,print.story

The link to the print-friendly version appears like this in the HTML of the regular article's page:
Code:
<div><img src="/common/images/icons/atools-printer.gif" alt="Print" /><a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" rel="nofollow" >Print</a></div>
What would be the best way to get the printable versions instead of the regular articles?

I already tried the following, but I think it only looks at the RSS feed itself rather than at each article's page, so it did not help.
Code:
def print_version(self, url):
    soup = self.index_to_soup(url)
    for item in soup.findAll('a', attrs={'rel':'nofollow'}):
        strhref = item['href']
        match = strhref.find('print.story')
        if match > -1:
            return strhref

        return None
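Two things in the attempt above may be contributing, independent of which page is being fetched: the `return None` sits inside the `for` loop, so only the first `nofollow` link is ever checked, and the `href` on the page is site-relative rather than a full URL. Below is a minimal sketch of just the link-matching step in isolation. It assumes Python 3's standard library instead of the recipe's `index_to_soup`, and the names `PrintLinkFinder` and `find_print_url` are made up for illustration; it is not meant as the recipe's actual method.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PrintLinkFinder(HTMLParser):
    """Remembers the first <a> href that ends with 'print.story'."""

    def __init__(self):
        super().__init__()
        self.print_href = None

    def handle_starttag(self, tag, attrs):
        # Ignore non-links, and stop looking once a match has been found.
        if tag != 'a' or self.print_href is not None:
            return
        href = dict(attrs).get('href') or ''
        if href.endswith('print.story'):
            self.print_href = href


def find_print_url(page_html, article_url):
    """Return the absolute print-friendly URL, or None if no link matches."""
    finder = PrintLinkFinder()
    finder.feed(page_html)
    # Only give up after every link on the page has been examined.
    if finder.print_href is None:
        return None
    # The href is site-relative, so resolve it against the article URL.
    return urljoin(article_url, finder.print_href)


# Sample markup: an unrelated nofollow link first, then the Print link
# copied from the article page quoted earlier in this post.
sample = ('<div><a href="/about" rel="nofollow">About</a></div>'
          '<div><img src="/common/images/icons/atools-printer.gif" alt="Print" />'
          '<a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" '
          'rel="nofollow" >Print</a></div>')
article = ('http://www.orlandosentinel.com/business/'
           'orl-existing-home-sales-orlando-100908,0,2581414.story')

print(find_print_url(sample, article))
```

If the same idea were carried back into `print_version`, the equivalent changes would presumably be to move `return None` after the loop and to join the matched `href` with the site's base URL before returning it, though I have not confirmed that this alone makes the recipe work.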
Thanks in advance for any help you can provide.