Device: Kindle 3
ClicRBS, Zero Hora, O Pioneiro, Portuguese (BR) recipe
This is my second Python recipe for the Kindle using Calibre, and I'm happy to share it. It is based on the ClicRBS RSS feeds for southern Brazilian online papers such as Zero Hora and O Pioneiro, plus other Brazilian Portuguese news feeds.
It still needs some polishing, so please post any problems you find. If you have a better recipe, please feel free to contribute.
Cheers.
The first recipe is now obsolete and is kept here only for reference:
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class ClicRBS(BasicNewsRecipe):
    title = u'ClicRBS'
    oldest_article = 3  # days
    max_articles_per_feed = 9
    cover_url = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    # Page elements stripped from every article.
    remove_tags = [
        dict(name='div', attrs={'class':['clic-barra-inner', 'botao-versao-mobile ']})
    ]

    # NOTE: each of these attributes holds a single selector, so only the
    # last assignment of each name takes effect; the earlier ones are overridden.
    remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    remove_tags_before = dict(name='div', attrs={'id':'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    remove_tags_before = dict(name='div', attrs={'class':'coluna'})
    remove_tags_after = dict(name='div', attrs={'class':'extra'})
    remove_tags_after = dict(name='div', attrs={'id':'links-patrocinados'})
    remove_tags_after = dict(name='h4', attrs={'class':'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class':'lista'})

    # (feed title, feed URL) pairs.
    feeds = [
        (u'zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=1&local=1&channel=13'),
        (u'diariocatarinense.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=2&local=18&channel=67'),
        (u'Concursos e Emprego', u'http://g1.globo.com/Rss2/0,,AS0-9654,00.xml'),
        (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'),
        (u'Economia, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=801&uf=1&local=1&channel=13'),
        (u'Esportes, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=802&uf=1&local=1&channel=13'),
        (u'Economia, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1180&channel=87&uf=1&local=1'),
        (u'Política, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1185&channel=87&uf=1&local=1'),
        (u'Mundo, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1184&channel=87&uf=1&local=1'),
        (u'Catarinense, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=&theme=371&uf=2&channel=2'),
        (u'Geral, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1183&channel=87&uf=1&local=1'),
        (u'Estilo de Vida, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=805&uf=1&local=1&channel=13'),
        (u'Corrida, Corrida, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1313&theme=15704&uf=1&channel=2'),
        (u'Jornal de Santa Catarina, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?espid=159&uf=2&local=18'),
        (u'Grêmio, Futebol, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=11&theme=65&uf=1&channel=2'),
        (u'Velocidade, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1314&theme=2655&uf=1&channel=2'),
    ]

    # Styling applied to the downloaded articles.
    extra_css = '''
        cite{color:#007BB5; font-size:xx-small; font-style:italic;}
        body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
        h3{font-size:large; color:#082963; font-weight:bold;}
        #ident{color:#0179B4; font-size:xx-small;}
        p{color:#000000;font-weight:normal;}
        .commentario p{color:#007BB5; font-style:italic;}
    '''
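One thing worth knowing when polishing a recipe like this: a Python class body runs top to bottom, so assigning `remove_tags_before` (or `remove_tags_after`) several times leaves only the last value in place. A minimal sketch with a stand-in class (`Demo` is hypothetical and does not need Calibre installed):

```python
# Stand-in for a recipe class; plain Python, no Calibre import required.
class Demo:
    remove_tags_before = dict(name='div', attrs={'id': 'glb-corpo'})
    # This second assignment rebinds the same name and replaces the first.
    remove_tags_before = dict(name='div', attrs={'class': 'coluna'})


# Only the final binding survives in the finished class.
effective = Demo.remove_tags_before
print(effective)  # {'name': 'div', 'attrs': {'class': 'coluna'}}
```

So if an article still shows clutter above the text, it is the last selector in the list that needs adjusting; the earlier ones are never consulted.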
Second code, still valid but very unpolished:
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class ZH(BasicNewsRecipe):
    title = u'ZH em testes'
    oldest_article = 3  # days
    max_articles_per_feed = 9
    cover_url = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    # Page elements stripped from every article.
    remove_tags = [
        dict(name='div', attrs={'class':['clic-barra-inner', 'botao-versao-mobile ']})
    ]

    # NOTE: each of these attributes holds a single selector, so only the
    # last assignment of each name takes effect; the earlier ones are overridden.
    remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    remove_tags_before = dict(name='div', attrs={'id':'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    remove_tags_before = dict(name='div', attrs={'class':'coluna'})
    remove_tags_after = dict(name='div', attrs={'class':'extra'})
    remove_tags_after = dict(name='div', attrs={'id':'links-patrocinados'})
    remove_tags_after = dict(name='h4', attrs={'class':'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class':'lista'})

    # (feed title, feed URL) pairs.
    feeds = [
        (u'zerohora.com, clicRBS', u'http://br.zerohora.feedsportal.com/c/33341/f/566001/index.rss'),
        (u'Economia, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566002/index.rss'),
        (u'Segundo Caderno, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566004/index.rss'),
        (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'),
        (u'Paulo SantAna', u'http://br.zerohora.feedsportal.com/c/33341/f/566007/index.rss'),
        (u'Wianey Carle', u'http://br.zerohora.feedsportal.com/c/33341/f/566009/index.rss'),
    ]

    # Styling applied to the downloaded articles.
    extra_css = '''
        cite{color:#007BB5; font-size:xx-small; font-style:italic;}
        body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
        h3{font-size:large; color:#082963; font-weight:bold;}
        #ident{color:#0179B4; font-size:xx-small;}
        p{color:#000000;font-weight:normal;}
        .commentario p{color:#007BB5; font-style:italic;}
    '''
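For anyone who wants to add more ClicRBS sections: the ClicRBS feed URLs in these recipes differ only in their query parameters (`channel`, `uf`, `local` and, for a section, `sect_id`). Here is a rough sketch of composing them, assuming the endpoint still accepts the same parameters as the URLs above. The helper name `clicrbs_feed` is my own invention, and the import is the Python 3 one, so this targets recent Calibre versions.

```python
from urllib.parse import urlencode

# Endpoint copied from the feed URLs in the recipes above.
BASE = 'http://www.clicrbs.com.br/jsp/rssfeed.jspx'


def clicrbs_feed(channel, uf, local, sect_id=None):
    """Build a ClicRBS feed URL (hypothetical helper, not part of Calibre)."""
    params = {}
    if sect_id is not None:
        params['sect_id'] = sect_id  # section filter goes first, as in the recipe URLs
    params.update(channel=channel, uf=uf, local=local)
    return BASE + '?' + urlencode(params)


# Rebuild two of the Pioneiro entries from the recipes.
pioneiro = clicrbs_feed(channel=87, uf=1, local=1)
pioneiro_economia = clicrbs_feed(channel=87, uf=1, local=1, sect_id=1180)
print(pioneiro)
print(pioneiro_economia)
```

The results can be dropped into `feeds` as the second element of each `(title, url)` pair. I have not checked which `sect_id` values are still live, so treat any new combination as something to test.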
Last edited by arvoredo; 06-12-2011 at 06:09 PM.
Reason: Older code was obsolete