#1
Member
Posts: 19
Karma: 10
Join Date: Oct 2008
Device: Sony PRS-505
I'm having trouble getting the print versions of articles from the Orlando Sentinel. The problem is that they have completely different article numbers for the regular and print-friendly versions of a feature.
For instance, in this RSS feed: http://feeds.feedburner.com/orlandosentinel
Regular version, with the link provided in the RSS: http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,2581414.story
Print-friendly version (the link is found on the regular article's page): http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,95752,print.story
The print-friendly link shows up like this in the regular version:
Code:
<div><img src="/common/images/icons/atools-printer.gif" alt="Print" /><a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" rel="nofollow" >Print</a></div>
I already tried this, but I think it's just looking at the actual RSS feed instead of each article, so it did not help.
Code:
def print_version(self, url):
    soup = self.index_to_soup(url)
    for item in soup.findAll('a', attrs={'rel':'nofollow'}):
        strhref = item['href']
        match = strhref.find('print.story')
        if match > -1:
            return strhref
    return None
#2
creator of calibre
Posts: 45,609
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Code:
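import re

def print_version(self, url):
    # Fetch the article page itself and look for its "Print" link.
    for a in self.index_to_soup(url).findAll('a', href=re.compile(r'print\.story')):
        if a.string and 'Print' in a.string:
            # The href is site-relative, so prepend the host.
            return 'http://www.orlandosentinel.com' + a['href']
    return url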
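For anyone who wants to check the link-matching logic outside of calibre, here is a rough standalone sketch. It is not the recipe code: it swaps calibre's index_to_soup for Python's standard-library HTML parser, and the names SAMPLE_HTML, PrintLinkFinder and find_print_url are made up for illustration. It assumes the page markup looks like the snippet quoted in post #1.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Markup copied from the article page quoted in post #1.
SAMPLE_HTML = (
    '<div><img src="/common/images/icons/atools-printer.gif" alt="Print" />'
    '<a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" '
    'rel="nofollow" >Print</a></div>'
)


class PrintLinkFinder(HTMLParser):
    """Hypothetical helper: remembers the first <a> href containing 'print.story'."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag == 'a' and self.href is None:
            href = dict(attrs).get('href') or ''
            if re.search(r'print\.story', href):
                self.href = href


def find_print_url(article_url, html):
    """Return the absolute print URL, or article_url if no print link is found."""
    finder = PrintLinkFinder()
    finder.feed(html)
    if finder.href is None:
        return article_url
    # urljoin resolves the site-relative href against the article's own URL.
    return urljoin(article_url, finder.href)


article = ('http://www.orlandosentinel.com/business/'
           'orl-existing-home-sales-orlando-100908,0,2581414.story')
print(find_print_url(article, SAMPLE_HTML))
```

Using urljoin instead of a hard-coded host is only a convenience here; it gives the same result for a site-relative href and keeps working if the link is already absolute.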
#3
Junior Member
Posts: 3
Karma: 10
Join Date: Mar 2010
Location: Oviedo, FL
Device: Kindle 2
Acey,
Were you able to get your recipe to work with the Orlando Sentinel?
Gatorguy