Old 02-18-2009, 08:29 AM   #7
kiklop74
Here is an extended recipe that filters out the extra junk around each article:

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1234959710(BasicNewsRecipe):
    title                 = u'la Repubblica'
    oldest_article        = 1    # in days
    max_articles_per_feed = 100
    remove_javascript     = True
    no_stylesheets        = True

    # Keep only the container that holds the article itself
    keep_only_tags = [dict(name='div', attrs={'class': 'articolo'})]

    # Inside that container, drop everything that is not article text
    remove_tags = [
        dict(name=['object', 'link']),
        dict(name='span', attrs={'class': 'linkindice'}),
        dict(name='div', attrs={'class': 'bottom-mobile'}),
        dict(name='div', attrs={'id': ['rssdiv', 'blocco']}),
    ]

    feeds = [
        (u'Repubblica homepage', u'http://www.repubblica.it/rss/homepage/rss2.0.xml'),
        (u'Repubblica Scienze', u'http://www.repubblica.it/rss/scienze/rss2.0.xml'),
        (u'Repubblica Tecnologia', u'http://www.repubblica.it/rss/tecnologia/rss2.0.xml'),
        (u'Repubblica Esteri', u'http://www.repubblica.it/rss/esteri/rss2.0.xml'),
    ]
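
If it is not obvious how the remove_tags entries are read, here is a rough standalone sketch of the matching idea. This is not calibre's actual code (calibre hands these dicts to its HTML parser); the helpers matches and should_remove are made up for illustration, and the sketch assumes each attribute holds a single value, so a tag with several classes would behave differently in the real thing.

```python
# Simplified illustration only: NOT how calibre implements this internally.
# `matches` and `should_remove` are hypothetical helper names.

remove_tags = [
    dict(name=['object', 'link']),
    dict(name='span', attrs={'class': 'linkindice'}),
    dict(name='div', attrs={'class': 'bottom-mobile'}),
    dict(name='div', attrs={'id': ['rssdiv', 'blocco']}),
]


def matches(tag_name, tag_attrs, spec):
    """Return True if a tag (name plus attribute dict) satisfies one spec."""
    names = spec.get('name')
    if names is not None:
        if isinstance(names, str):
            names = [names]          # a single name behaves like a one-item list
        if tag_name not in names:
            return False
    for key, wanted in spec.get('attrs', {}).items():
        if isinstance(wanted, str):
            wanted = [wanted]        # same for attribute values
        if tag_attrs.get(key) not in wanted:
            return False
    return True


def should_remove(tag_name, tag_attrs):
    """A tag is removed as soon as any one spec in the list matches it."""
    return any(matches(tag_name, tag_attrs, spec) for spec in remove_tags)


print(should_remove('div', {'id': 'blocco'}))       # True
print(should_remove('div', {'class': 'articolo'}))  # False
print(should_remove('object', {}))                  # True
```

So a list for name or for an attribute value means "any of these", while the name and attrs within one dict must all match together.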