04-15-2012, 10:53 AM   #1
scissors
Addict
 
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
countryfile.com update

Now uses BeautifulSoup and some string slicing to get the current cover off the website (there's a note on the slicing after the code below).

(content is unchanged)

Spoiler:
Code:
import mechanize
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1325006965(BasicNewsRecipe):
    title          = u'Countryfile.com'
    #cover_url = 'http://www.countryfile.com/sites/default/files/imagecache/160px_wide/cover/2_1.jpg'
    __author__ = 'Dave Asbury'
    description           = 'The official website of Countryfile Magazine'
    # last updated 15/4/12
    language = 'en_GB'
    oldest_article = 30
    max_articles_per_feed = 25
    remove_empty_feeds = True
    no_stylesheets = True
    auto_cleanup = True
    #articles_are_obfuscated = True
    def get_cover_url(self):
        # grab the homepage and locate the div that wraps the current cover image
        soup = self.index_to_soup('http://www.countryfile.com/')
        cov = soup.find(attrs={'class' : 'imagecache imagecache-160px_wide imagecache-linked imagecache-160px_wide_linked'})
        #print '******** ',cov,' ***'
        # slice the image URL out of the stringified tag
        cov2 = str(cov)
        cov2 = cov2[124:-90]
        #print '******** ',cov2,' ***'

        # check the sliced URL actually resolves; if not, fall back to a known cover
        br = mechanize.Browser()
        br.set_handle_redirect(False)
        try:
            br.open_novisit(cov2)
            cover_url = cov2
        except:
            cover_url = 'http://www.countryfile.com/sites/default/files/imagecache/160px_wide/cover/2_1.jpg'
        return cover_url
    remove_tags = [
        # dict(attrs={'class' : ['player']}),
    ]
    feeds = [
        (u'Homepage', u'http://www.countryfile.com/rss/home'),
        (u'Country News', u'http://www.countryfile.com/rss/news'),
        (u'Countryside', u'http://www.countryfile.com/rss/countryside'),
    ]
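
The [124:-90] slice just peels the image URL out of the stringified div, so it will break if the markup around the cover changes. Here's a rough sketch of a less position-dependent version that reads the src attribute off the <img> instead - assuming the cover really does sit in an <img> inside that div, which I haven't checked against the live page:

Code:
    def get_cover_url(self):
        soup = self.index_to_soup('http://www.countryfile.com/')
        cov = soup.find(attrs={'class' : 'imagecache imagecache-160px_wide imagecache-linked imagecache-160px_wide_linked'})
        # pull the src attribute directly instead of slicing the string
        img = cov.find('img') if cov is not None else None
        if img is not None and img.get('src'):
            return img['src']
        # fall back to the known cover if the div or image isn't there
        return 'http://www.countryfile.com/sites/default/files/imagecache/160px_wide/cover/2_1.jpg'
This drops the mechanize open_novisit check, so it only falls back when the div or image is missing, not when the URL fails to resolve; the try/except test from the recipe above could be kept on top of it if that matters.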