10-31-2010, 02:05 AM   #1
arvoredo
Junior Member
Posts: 4
Karma: 10
Join Date: Oct 2010
Device: Kindle 3
ClicRBS, Zero Hora, O Pioneiro, Portuguese BR recipe

This is my second Python recipe for the Kindle using Calibre, and I'm happy to share it. It is based on the ClicRBS RSS feeds for southern Brazilian online newspapers such as Zero Hora and O Pioneiro, along with other Brazilian Portuguese news feeds.

It still needs some polishing, so please post whatever problems you find. If you have a better recipe, feel free to contribute it.

Cheers.

The first version of the code is now obsolete and is kept here only for the record:
Spoiler:
Code:

from calibre.web.feeds.news import BasicNewsRecipe


class ClicRBS(BasicNewsRecipe):
    title          = u'ClicRBS'
    oldest_article = 3
    max_articles_per_feed = 9
    cover_url             = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    remove_tags = [
        dict(name='div', attrs={'class': ['clic-barra-inner', 'botao-versao-mobile ']})
    ]

    # Python keeps only the last value bound to a name, so just one
    # remove_tags_before and one remove_tags_after is ever in effect.
    # The other candidate selectors are left commented out.
    # remove_tags_before = dict(name='div', attrs={'id': 'glb-corpo'})
    # remove_tags_before = dict(name='div', attrs={'class': 'descricao'})
    remove_tags_before = dict(name='div', attrs={'class': 'coluna'})
    # remove_tags_after = dict(name='div', attrs={'class': 'extra'})
    # remove_tags_after = dict(name='div', attrs={'id': 'links-patrocinados'})
    # remove_tags_after = dict(name='h4', attrs={'class': 'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class': 'lista'})

    feeds = [
               (u'zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=1&local=1&channel=13')
             , (u'diariocatarinense.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?uf=2&local=18&channel=67')
             , (u'Concursos e Emprego', u'http://g1.globo.com/Rss2/0,,AS0-9654,00.xml')
             , (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1')
             , (u'Economia, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=801&uf=1&local=1&channel=13')
             , (u'Esportes, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=802&uf=1&local=1&channel=13')
             , (u'Economia, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1180&channel=87&uf=1&local=1')
             , (u'Política, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1185&channel=87&uf=1&local=1')
             , (u'Mundo, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1184&channel=87&uf=1&local=1')
             , (u'Catarinense, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=&theme=371&uf=2&channel=2')
             , (u'Geral, Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1183&channel=87&uf=1&local=1')
             , (u'Estilo de Vida, zerohora.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=805&uf=1&local=1&channel=13')
             , (u'Corrida, Corrida, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1313&theme=15704&uf=1&channel=2')
             , (u'Jornal de Santa Catarina, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?espid=159&uf=2&local=18')
             , (u'Grêmio, Futebol, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=11&theme=65&uf=1&channel=2')
             , (u'Velocidade, Esportes, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?sect_id=1314&theme=2655&uf=1&channel=2')
            ]

    extra_css = '''
                    cite{color:#007BB5; font-size:xx-small; font-style:italic;}
                    body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
                    h3{font-size:large; color:#082963; font-weight:bold;}
                    #ident{color:#0179B4; font-size:xx-small;}
                    p{color:#000000;font-weight:normal;}                    
                    .commentario p{color:#007BB5; font-style:italic;}
                '''
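
A note for anyone polishing this recipe: a Python class body keeps only the last value bound to a name, so stacking several remove_tags_before or remove_tags_after lines leaves just the final one in effect. The sketch below shows that with a plain stand-in class. FakeRecipe is a hypothetical name and does not import calibre; it only illustrates the language behaviour.

```python
# Minimal sketch: FakeRecipe is a hypothetical stand-in, not calibre's
# BasicNewsRecipe. It only demonstrates Python's class-body semantics.
class FakeRecipe(object):
    # Each assignment rebinds the same name; earlier values are discarded.
    remove_tags_before = dict(name='div', attrs={'id': 'glb-corpo'})
    remove_tags_before = dict(name='div', attrs={'class': 'descricao'})
    remove_tags_before = dict(name='div', attrs={'class': 'coluna'})  # this one wins

    # remove_tags is a list, so it can hold as many selectors as needed.
    remove_tags = [
        dict(name='div', attrs={'class': 'clic-barra-inner'}),
        dict(name='div', attrs={'class': 'botao-versao-mobile'}),
    ]


effective = FakeRecipe.remove_tags_before
print(effective)
```

Since only one cut point of each kind survives, unwanted blocks that sit before or after the article may need to be listed individually in remove_tags instead.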



Second code, still valid but very unpolished:
Code:
Spoiler:
from calibre.web.feeds.news import BasicNewsRecipe


class ZH(BasicNewsRecipe):
    title = u'ZH em testes'
    oldest_article = 3
    max_articles_per_feed = 9
    cover_url = 'http://www.publicidade.clicrbs.com.br/clicrbs/imgs/logo_clic.gif'

    remove_tags = [
        dict(name='div', attrs={'class': ['clic-barra-inner', 'botao-versao-mobile ']})
    ]

    # Python keeps only the last value bound to a name, so just one
    # remove_tags_before and one remove_tags_after is ever in effect.
    # The other candidate selectors are left commented out.
    # remove_tags_before = dict(name='div', attrs={'id': 'glb-corpo'})
    # remove_tags_before = dict(name='div', attrs={'class': 'descricao'})
    remove_tags_before = dict(name='div', attrs={'class': 'coluna'})
    # remove_tags_after = dict(name='div', attrs={'class': 'extra'})
    # remove_tags_after = dict(name='div', attrs={'id': 'links-patrocinados'})
    # remove_tags_after = dict(name='h4', attrs={'class': 'tipo-c comente'})
    remove_tags_after = dict(name='ul', attrs={'class': 'lista'})

    feeds = [
        (u'zerohora.com, clicRBS', u'http://br.zerohora.feedsportal.com/c/33341/f/566001/index.rss'),
        (u'Economia, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566002/index.rss'),
        (u'Segundo Caderno, zerohora.com', u'http://br.zerohora.feedsportal.com/c/33341/f/566004/index.rss'),
        (u'Pioneiro.com, clicRBS', u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'),
        (u'Paulo SantAna', u'http://br.zerohora.feedsportal.com/c/33341/f/566007/index.rss'),
        (u'Wianey Carle', u'http://br.zerohora.feedsportal.com/c/33341/f/566009/index.rss'),
    ]

    extra_css = '''
        cite{color:#007BB5; font-size:xx-small; font-style:italic;}
        body{font-family:Arial,Helvetica,sans-serif;font-size:x-small;}
        h3{font-size:large; color:#082963; font-weight:bold;}
        #ident{color:#0179B4; font-size:xx-small;}
        p{color:#000000;font-weight:normal;}
        .commentario p{color:#007BB5; font-style:italic;}
    '''
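
One possible bit of polishing: the feedsportal URLs in this second recipe differ only in a numeric id, so the feeds list could be generated rather than typed out. This is a rough sketch, assuming the URL pattern seen in the five feeds above holds in general. FEED_URL, ZH_FEED_IDS and build_feeds are hypothetical names introduced here, not part of calibre.

```python
# Sketch only: the ids and titles come from the recipe above; the helper
# names are made up for illustration and are not part of calibre.
FEED_URL = 'http://br.zerohora.feedsportal.com/c/33341/f/%d/index.rss'

ZH_FEED_IDS = [
    (u'zerohora.com, clicRBS', 566001),
    (u'Economia, zerohora.com', 566002),
    (u'Segundo Caderno, zerohora.com', 566004),
    (u'Paulo SantAna', 566007),
    (u'Wianey Carle', 566009),
]


def build_feeds(feed_ids, url_template=FEED_URL):
    """Return (title, url) tuples, the shape a recipe's `feeds` list uses."""
    return [(title, url_template % feed_id) for title, feed_id in feed_ids]


feeds = build_feeds(ZH_FEED_IDS)

# The Pioneiro feed uses a different URL scheme, so it is added by hand at
# the same position (fourth) it has in the recipe.
feeds.insert(3, (u'Pioneiro.com, clicRBS',
                 u'http://www.clicrbs.com.br/jsp/rssfeed.jspx?channel=87&uf=1&local=1'))

for title, url in feeds:
    print(title, url)
```

Inside a recipe, the resulting list would be assigned to the class attribute feeds. The order of the list is preserved, which matters if the section order in the generated e-book should stay the same.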

Last edited by arvoredo; 06-12-2011 at 07:09 PM. Reason: Older code was obsolete

Tags
brasil, brazil, clicrbs, jornal zero hora pioneiro, rio grande do sul
