10-31-2010, 01:05 AM   #1
arvoredo
Device: Kindle 3
ClicRBS, Zero Hora, O Pioneiro: Brazilian Portuguese recipe

These are my first Python recipes for the Kindle using Calibre, and I'm happy to share them. They are based on the ClicRBS RSS feeds of southern Brazilian online newspapers such as Zero Hora and O Pioneiro, plus a few other Brazilian Portuguese news feeds.

The recipes still need some polishing, so please post any problems you find. If you have a better recipe, please feel free to contribute.

Cheers.

The first version is now obsolete and is kept here only for reference:
Code:

from calibre.web.feeds.news import BasicNewsRecipe


class ClicRBS(BasicNewsRecipe):
    title                 = u'ClicRBS'
    oldest_article        = 3  # days
    max_articles_per_feed = 9
    cover_url             = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    # Page chrome stripped from every article (stray trailing space in the
    # second class name removed so that it can match).
    remove_tags = [
        dict(name='div', attrs={'class':['clic-barra-inner', 'botao-versao-mobile']})
    ]

    # Each attribute below holds a single tag. The original assigned each one
    # four times, so only the last assignment ever took effect; the earlier
    # candidates are kept as comments.
    # remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    # remove_tags_before = dict(name='div', attrs={'id':'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class':'coluna'})
    # remove_tags_after = dict(name='div', attrs={'class':'extra'})
    # remove_tags_after = dict(name='div', attrs={'id':'links-patrocinados'})
    # remove_tags_after = dict(name='h4', attrs={'class':'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class':'lista'})

    # (feed title, feed URL) pairs
    feeds = [
        (u'zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=1&local=1&channel=13'),
        (u'diariocatarinense.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=2&local=18&channel=67'),
        (u'Concursos e Emprego', u'http://g1.globo.com/Rss2/0,,AS0-9654,00.xml'),
        (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'),
        (u'Economia, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=801&uf=1&local=1&channel=13'),
        (u'Esportes, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=802&uf=1&local=1&channel=13'),
        (u'Economia, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1180&channel=87&uf=1&local=1'),
        (u'Política, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1185&channel=87&uf=1&local=1'),
        (u'Mundo, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1184&channel=87&uf=1&local=1'),
        (u'Catarinense, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=&theme=371&uf=2&channel=2'),
        (u'Geral, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1183&channel=87&uf=1&local=1'),
        (u'Estilo de Vida, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=805&uf=1&local=1&channel=13'),
        (u'Corrida, Corrida, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1313&theme=15704&uf=1&channel=2'),
        (u'Jornal de Santa Catarina, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?espid=159&uf=2&local=18'),
        (u'Grêmio, Futebol, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=11&theme=65&uf=1&channel=2'),
        (u'Velocidade, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1314&theme=2655&uf=1&channel=2'),
    ]

    # Styling applied to the downloaded articles
    extra_css = '''
        cite{color:#007BB5; font-size:xx-small; font-style:italic;}
        body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
        h3{font-size:large; color:#082963; font-weight:bold;}
        #ident{color:#0179B4; font-size:xx-small;}
        p{color:#000000;font-weight:normal;}
        .commentario p{color:#007BB5; font-style:italic;}
    '''
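
Most of the clicRBS feed URLs above differ only in their query parameters (`uf`, `local`, `channel`, `sect_id`), so the list could be generated instead of typed out by hand. A minimal sketch, assuming Python 3: the helper name `clicrbs_feed` is hypothetical, and the meaning of each parameter is not documented in this post; only the values are copied from the feeds above.

```python
from urllib.parse import urlencode

# Base address shared by the clicRBS feeds listed in the recipe.
BASE = 'http://www.clicrbs.com.br/jsp/rssfeed.jspx'


def clicrbs_feed(title, **params):
    """Hypothetical helper: return a (title, url) pair for a recipe's feeds list."""
    # Keyword arguments keep their order, so the URL matches the hand-written one.
    return (title, BASE + '?' + urlencode(params))


feeds = [
    clicrbs_feed('zerohora.com, clicRBS', uf=1, local=1, channel=13),
    clicrbs_feed('Economia, zerohora.com, clicRBS', sect_id=801, uf=1, local=1, channel=13),
    clicrbs_feed('Esportes, zerohora.com, clicRBS', sect_id=802, uf=1, local=1, channel=13),
]

for title, url in feeds:
    print(title, '->', url)
```

Whether this is worth it depends on how many sections you follow; for a handful of feeds the literal list is easier to read.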



The second version is still valid but very unpolished:
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class ZH(BasicNewsRecipe):
    title                 = u'ZH em testes'
    oldest_article        = 3  # days
    max_articles_per_feed = 9
    cover_url             = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    # Page chrome stripped from every article (stray trailing space in the
    # second class name removed so that it can match).
    remove_tags = [
        dict(name='div', attrs={'class':['clic-barra-inner', 'botao-versao-mobile']})
    ]

    # Each attribute below holds a single tag. The original assigned each one
    # four times, so only the last assignment ever took effect; the earlier
    # candidates are kept as comments.
    # remove_tags_before = dict(name='div', attrs={'class':'descricao'})
    # remove_tags_before = dict(name='div', attrs={'id':'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class':'coluna'})
    # remove_tags_after = dict(name='div', attrs={'class':'extra'})
    # remove_tags_after = dict(name='div', attrs={'id':'links-patrocinados'})
    # remove_tags_after = dict(name='h4', attrs={'class':'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class':'lista'})

    # (feed title, feed URL) pairs
    feeds = [
        (u'zerohora.com, clicRBS', u'http://br.zerohora.feedsportal.com/c/33341/f/566001/index.rss'),
        (u'Economia, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566002/index.rss'),
        (u'Segundo Caderno, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566004/index.rss'),
        (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'),
        (u'Paulo SantAna', u'http://br.zerohora.feedsportal.com/c/33341/f/566007/index.rss'),
        (u'Wianey Carle', u'http://br.zerohora.feedsportal.com/c/33341/f/566009/index.rss'),
    ]

    # Styling applied to the downloaded articles
    extra_css = '''
        cite{color:#007BB5; font-size:xx-small; font-style:italic;}
        body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
        h3{font-size:large; color:#082963; font-weight:bold;}
        #ident{color:#0179B4; font-size:xx-small;}
        p{color:#000000;font-weight:normal;}
        .commentario p{color:#007BB5; font-style:italic;}
    '''
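
A note for anyone polishing these recipes: in Python, assigning the same class attribute more than once keeps only the last value, so a recipe can carry just one `remove_tags_before` and one `remove_tags_after` selector at a time. The sketch below shows this with a plain class; `Demo` is a hypothetical stand-in and not part of Calibre, and the selectors are the ones used in this post.

```python
class Demo:
    # Three assignments to the same name: each one replaces the previous value.
    remove_tags_before = dict(name='div', attrs={'class': 'descricao'})
    remove_tags_before = dict(name='div', attrs={'id': 'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class': 'coluna'})


# Only the last selector survives on the class.
print(Demo.remove_tags_before)
# → {'name': 'div', 'attrs': {'class': 'coluna'}}
```

To strip several unrelated blocks, list them all in `remove_tags`, which takes a list of selectors.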

Last edited by arvoredo; 06-12-2011 at 06:09 PM. Reason: Older code was obsolete

Tags
brasil, brazil, clicrbs, jornal zero hora pioneiro, rio grande do sul

