02-07-2011, 09:28 PM   #1
cfholbert
Help Fixing sltrib Recipe

I need help fixing a custom recipe for the SLC Tribune. The recipe mostly works, but roughly every third news article comes out as garbage. Also, when I tried to add the Technology section, it did not pull any articles. I'm not sure why, since the other sections work. The recipe is given below. Many thanks!

SLTRIB RECIPE:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1278347258(BasicNewsRecipe):
    title      = u'Salt Lake City Tribune'
    __author__ = 'Charles Holbert'
    oldest_article = 1
    max_articles_per_feed = 100

    description           = '''Utah's independent news source since 1871'''
    publisher             = 'http://www.sltrib.com/'
    category              = 'news, Utah, SLC'
    language              = 'en'
    encoding              = 'utf-8'
    remove_javascript     = True
    use_embedded_content  = False
    no_stylesheets        = True

    # Strip teaser, ad, and related-story blocks from each article page.
    remove_tags = [
        dict(name='div', attrs={'id': ['teaser', 'adCol', 'keywordStories']}),
        dict(name='div', attrs={'class': 'tripleWide datos'}),
    ]

    # Keep only the lead image, its caption, the headline, the byline, and the story body.
    keep_only_tags = [
        dict(name='div', attrs={'class': 'theImage'}),
        dict(name='div', attrs={'id': 'topImageCaption'}),
        dict(name='div', attrs={'class': 'theHeadline entry-title'}),
        dict(name='div', attrs={'class': 'byline'}),
        dict(name='div', attrs={'id': 'storytext'}),
    ]

    feeds = [
        (u'SL Tribune Today', u'http://www.sltrib.com/csp/cms/sites/sltrib/RSS/rss.csp?cat=All'),
        (u'Utah News', u'http://www.sltrib.com/csp/cms/sites/sltrib/RSS/rss.csp?cat=UtahNews'),
        (u'Business News', u'http://www.sltrib.com/csp/cms/sites/sltrib/RSS/rss.csp?cat=Money'),
        (u'Most Popular', u'http://www.sltrib.com/csp/cms/sites/sltrib/RSS/rsspopular.csp'),
        (u'Sports', u'http://www.sltrib.com/csp/cms/sites/sltrib/RSS/rss.csp?cat=Sports'),
    ]

    extra_css = '''
                .theHeadline{font-family:Arial,Helvetica,sans-serif; font-size:xx-large; font-weight: bold; color:#0E5398;}
                .byline{font-family:Arial,Helvetica,sans-serif; color:#333333; font-size:xx-small;}
                #storytext, .storytext{font-family:Arial,Helvetica,sans-serif; font-size:medium;}
                .articleText{font-family:Arial,Helvetica,sans-serif; font-size:medium;}
                .caption{font-family:Arial,Helvetica,sans-serif; font-size:xx-small; margin-bottom: 1em;}
                '''
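
One guess about the empty Technology section, which I have not confirmed: with oldest_article = 1, a feed whose items are all more than a day old would yield nothing. The sketch below is a rough stand-alone check using only the Python standard library, not calibre's own code. The names recent_items, SAMPLE, and NOW are made up for illustration, and the sample feed is invented data. It lists which items in an RSS document would survive a given age cutoff.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def recent_items(rss_text, oldest_article_days, now=None):
    """Return (title, link) pairs for items no older than the cutoff.

    Hypothetical helper: it mimics the idea of oldest_article,
    it is not calibre's implementation.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=oldest_article_days)
    kept = []
    for item in ET.fromstring(rss_text).iter('item'):
        title = (item.findtext('title') or '').strip()
        link = (item.findtext('link') or '').strip()
        pub = item.findtext('pubDate')
        try:
            when = parsedate_to_datetime(pub.strip()) if pub else None
        except (TypeError, ValueError):
            when = None  # unparseable date
        if when is None:
            # Assumption of this sketch: undated items are kept.
            kept.append((title, link))
            continue
        if when.tzinfo is None:
            # Treat dates without a usable offset as UTC.
            when = when.replace(tzinfo=timezone.utc)
        if when >= cutoff:
            kept.append((title, link))
    return kept


# Invented sample feed, not real sltrib.com data.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Fresh story</title><link>http://example.com/a</link>
<pubDate>Mon, 07 Feb 2011 18:00:00 GMT</pubDate></item>
<item><title>Stale story</title><link>http://example.com/b</link>
<pubDate>Tue, 01 Feb 2011 18:00:00 GMT</pubDate></item>
</channel></rss>"""

NOW = datetime(2011, 2, 8, 4, 28, tzinfo=timezone.utc)
print(recent_items(SAMPLE, 1, NOW))
```

To try it on a real feed, pass in the text downloaded from the feed URL (for example with urllib.request.urlopen) and leave now unset. If the list comes back empty with a cutoff of 1 but not with a larger value, the age limit is the likely culprit. If it is empty either way, the feed URL itself deserves a look. Running the recipe with ebook-convert sltrib.recipe .epub --test -vv (the file name here is just a placeholder) should also show in the log which articles get fetched.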

Last edited by kovidgoyal; 02-07-2011 at 10:06 PM.