Here is the initial code:
__license__ = 'GPL v3'
__author__ = 'Luis Hernandez'
__copyright__ = 'Luis Hernandez<tolyluis@gmail.com>'
__version__ = 'v0.5'
__date__ = '02 Feb 2011'
'''
http://www.techno-science.net
'''
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1294946868(BasicNewsRecipe):
    title = u'Techno-Science'
    __author__ = 'Luis Hernandez'
    description = 'french page about tech and science'

    # Fetch articles up to 30 days old, at most 100 per feed.
    oldest_article = 30
    max_articles_per_feed = 100

    # Drop scripts and site stylesheets; download the full article pages
    # instead of using the content embedded in the feeds.
    remove_javascript = True
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'ISO-8859-1'
    remove_empty_feeds = True
    language = 'fr_FR'
    timefmt = '[%a, %d %b, %Y]'

    # Keep only the listed div classes, and trim everything before the
    # central module header and after the comments block.
    keep_only_tags = [dict(name='div', attrs={'class': ['titre', 'texte', 'conteneurEncadre']})]
    remove_tags_before = dict(name='div', attrs={'class': ['headerModuleCentre']})
    remove_tags_after = dict(name='div', attrs={'class': ['commentaires']})

    # (section title, RSS feed URL) pairs, one per site category.
    feeds = [
        (u'News', u'http://www.techno-science.net/include/news.xml'),
        (u'Aeronautique', u'http://www.techno-science.net/include/news6.xml'),
        (u'Transports', u'http://www.techno-science.net/include/news20.xml'),
        (u'Espace', u'http://www.techno-science.net/include/news7.xml'),
        (u'Energie', u'http://www.techno-science.net/include/news8.xml'),
        (u'Multimedia', u'http://www.techno-science.net/include/news9.xml'),
        (u'Architecture', u'http://www.techno-science.net/include/news10.xml'),
        (u'Mathematiques', u'http://www.techno-science.net/include/news11.xml'),
        (u'Physique', u'http://www.techno-science.net/include/news12.xml'),
        (u'Astrophysique', u'http://www.techno-science.net/include/news13.xml'),
        (u'Astronomie', u'http://www.techno-science.net/include/news14.xml'),
        (u'Vie et Terre', u'http://www.techno-science.net/include/news24.xml'),
        (u'Autres sujets', u'http://www.techno-science.net/include/news27.xml'),
        (u'Retro', u'http://www.techno-science.net/include/news25.xml'),
    ]
It's about a 2 MB file. Can you tell us something about this recipe?