02-02-2011, 05:18 PM | #1 |
La Tribuna de (v1.0) - Spanish/Spain
Hi all:
RELEASE NOTES
I recycled the code from "La tribuna de Talavera" and converted it into a more generic "La Tribuna de" recipe. The recipe now takes feeds from: Albacete, Avila, Burgos, Ciudad Real, Palencia, Puertollano, Talavera de la Reina, Toledo and Valladolid.
All of these websites are owned by Grupo PROMECAL and carry local news.

CHANGELOG
- Added new feeds that fit the original code from La Tribuna de Talavera
- Minor style changes

SOURCE CODE
Code:
__license__   = 'GPL v3'
__author__    = 'Luis Hernandez'
__copyright__ = 'Luis Hernandez<tolyluis@gmail.com>'
__version__   = 'v1.0'
__date__      = '01 Feb 2011'

'''
http://www.promecal.es/
'''

from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1294946868(BasicNewsRecipe):

    title                 = u'La Tribuna de'
    publisher             = u'Grupo PROMECAL'
    __author__            = 'Luis Hernández'
    description           = 'Varios diarios locales del grupo PROMECAL'

    oldest_article        = 3
    max_articles_per_feed = 50

    remove_javascript     = True
    no_stylesheets        = True
    use_embedded_content  = False

    encoding              = 'utf-8'
    language              = 'es_ES'
    timefmt               = '[%a, %d %b, %Y]'

    # Keep only the article body, its photo and the text paragraph
    keep_only_tags = [
         dict(name='div', attrs={'id':['articulo']})
        ,dict(name='div', attrs={'class':['foto']})
        ,dict(name='p', attrs={'id':['texto']})
    ]

    remove_tags_before = dict(name='div', attrs={'class':['comparte']})
    remove_tags_after  = dict(name='div', attrs={'id':['relacionadas']})

    # Drop the related-articles box and the minor headings
    remove_tags = [
         dict(name='div', attrs={'id':['relacionadas']})
        ,dict(name='h3')
        ,dict(name='h5')
    ]

    extra_css = """
        p{text-align: justify; font-size: 100%}
        body{text-align: left; font-family: serif; font-size: 100%}
        h1{font-family: sans; font-size:150%; font-weight: bold; text-align: justify;}
        h2{font-family: sans-serif; font-size:85%; font-style: italic; text-align: justify;}
        h4{font-family: sans; font-size:75%; font-weight: bold; text-align: center;}
        img{margin-bottom: 0.4em}
    """

    def preprocess_html(self, soup):
        # Replace every text-only link with its plain text
        for alink in soup.findAll('a'):
            if alink.string is not None:
                tstr = alink.string
                alink.replaceWith(tstr)
        return soup

    feeds = [
         (u'Albacete', u'http://www.latribunadealbacete.es/rss.html')
        ,(u'Avila', u'http://www.diariodeavila.es/rss.html')
        ,(u'Burgos', u'http://www.diariodeburgos.es/rss.html')
        ,(u'Ciudad Real', u'http://www.latribunadeciudadreal.es/rss.html')
        ,(u'Palencia', u'http://www.diariopalentino.es/rss.html')
        ,(u'Puertollano', u'http://www.latribunadepuertollano.es/rss.html')
        ,(u'Talavera de la Reina', u'http://www.latribunadetalavera.es/rss.html')
        ,(u'Toledo', u'http://www.latribunadetoledo.es/rss.html')
        ,(u'Valladolid', u'http://www.eldiadevalladolid.com/rss.html')
    ]

I suggest keeping the original filename (la_tribuna.recipe) intact; I think it is good as is. Only the title needed a change to fit the new content.

Hope you enjoy it!
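A note for anyone who wants to add another PROMECAL site later: every feed in the recipe follows the same http://<host>/rss.html pattern, so the list could probably be generated from the host names alone. The snippet below is only a sketch of that idea. SITES and build_feeds are hypothetical names made up for illustration; they are not part of the recipe or of calibre, and the snippet assumes any new site keeps the same /rss.html path.

```python
# Hypothetical helper, not part of the posted recipe or of calibre.
# Host names are copied from the feeds list in the recipe.
SITES = [
    ('Albacete', 'www.latribunadealbacete.es'),
    ('Avila', 'www.diariodeavila.es'),
    ('Burgos', 'www.diariodeburgos.es'),
    ('Ciudad Real', 'www.latribunadeciudadreal.es'),
    ('Palencia', 'www.diariopalentino.es'),
    ('Puertollano', 'www.latribunadepuertollano.es'),
    ('Talavera de la Reina', 'www.latribunadetalavera.es'),
    ('Toledo', 'www.latribunadetoledo.es'),
    ('Valladolid', 'www.eldiadevalladolid.com'),
]


def build_feeds(sites):
    """Return (title, url) pairs in the shape the recipe's feeds list uses.

    Assumes each site publishes its feed at /rss.html, as the nine
    sites in the recipe do.
    """
    return [(name, 'http://%s/rss.html' % host) for name, host in sites]


if __name__ == '__main__':
    for name, url in build_feeds(SITES):
        print('%s: %s' % (name, url))
```

Inside the recipe class this would amount to writing feeds = build_feeds(SITES), with both names defined above the class. The explicit list in the posted code works just as well; the sketch only saves some typing when a new site is added.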