08-13-2009, 07:31 AM   #656
acidzebra
Device: PRS-505
Here is a recipe I use for de Volkskrant (a Dutch newspaper); it works well:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1249039563(BasicNewsRecipe):
    title          = u'De Volkskrant'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    keep_only_tags = [dict(name='div', attrs={'id':'leftColumnArticle'}) ]
    remove_tags    = [
        dict(name='div',attrs={'class':'article_tools'}),
        dict(name='div',attrs={'id':'article_tools'}),
        dict(name='div',attrs={'class':'articletools'}),
        dict(name='div',attrs={'id':'articletools'}),
        dict(name='div',attrs={'id':'myOverlay'}),
        dict(name='div',attrs={'id':'trackback'}),
        dict(name='div',attrs={'id':'googleBanner'}),
        dict(name='div',attrs={'id':'article_headlines'}),
        ]
    extra_css      = '''
                        body{font-family:Arial,Helvetica,sans-serif; font-size:small;}
                        h1{font-size:large;}
                     '''

    feeds          = [
        (u'Laatste Nieuws', u'http://volkskrant.nl/rss/laatstenieuws.rss'),
        (u'Binnenlands nieuws', u'http://volkskrant.nl/rss/nederland.rss'),
        (u'Buitenlands nieuws', u'http://volkskrant.nl/rss/internationaal.rss'),
        (u'Economisch nieuws', u'http://volkskrant.nl/rss/economie.rss'),
        (u'Sportnieuws', u'http://volkskrant.nl/rss/sport.rss'),
        (u'Kunstnieuws', u'http://volkskrant.nl/rss/kunst.rss'),
        (u'Wetenschapsnieuws', u'http://feeds.feedburner.com/DeVolkskrantWetenschap'),
        (u'Technologienieuws', u'http://feeds.feedburner.com/vkmedia'),
        ]
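
In case the dict(name=..., attrs=...) entries in remove_tags look cryptic: each one describes a tag name plus the attributes an element must carry to be stripped from the article. The sketch below is only an illustration of that idea and is not calibre's actual code. The helper name matches, the sample elements list, and the exact string comparison on attribute values are all assumptions made for the example.

```python
# Illustration only: a simplified stand-in for how a spec such as
# dict(name='div', attrs={'id': 'trackback'}) could select elements.
# This is NOT calibre's implementation; 'matches' is a hypothetical helper.

def matches(tag_name, tag_attrs, spec):
    """Return True if an element (name + attribute dict) satisfies a spec."""
    # The tag name must agree when the spec gives one.
    if 'name' in spec and spec['name'] != tag_name:
        return False
    # Every attribute in the spec must be present with the same value
    # (exact string comparison is an assumption of this sketch).
    for key, wanted in spec.get('attrs', {}).items():
        if tag_attrs.get(key) != wanted:
            return False
    return True


# Two of the specs from the recipe above.
remove_tags = [
    dict(name='div', attrs={'class': 'article_tools'}),
    dict(name='div', attrs={'id': 'trackback'}),
]

# Made-up elements, each written as (tag name, attributes).
elements = [
    ('div', {'id': 'leftColumnArticle'}),
    ('div', {'id': 'trackback'}),
    ('div', {'class': 'article_tools'}),
    ('p', {'class': 'article_tools'}),
]

# Keep only the elements that no remove_tags spec matches.
kept = [
    (name, attrs)
    for name, attrs in elements
    if not any(matches(name, attrs, spec) for spec in remove_tags)
]

for name, attrs in kept:
    print(name, attrs)
```

The p element stays because its tag name differs from the spec, which is also why the recipe lists both a class and an id variant for the same div: a spec only hits when every listed attribute lines up.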