Hi buddies!
Today I am posting a recipe for a Brazilian magazine called "Revista Piaui". This is a first version, and some problems are not handled yet:
1 - Articles are sorted by their publication date in the feed, not by the order of the magazine's table of contents.
2 - Some articles require a subscriber code. Without that code, only the introduction to the article is available.
3 - Some links in the feed point to content that is "only on the site". These could be removed somehow.
4 - The first article is actually the magazine's list of contents. It could still be useful, though, because it also includes the cover.
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class RevistaPiaui(BasicNewsRecipe):
    title = 'Revista Piaui'
    __author__ = 'Diniz Bortolotto'
    description = 'Revista Piaui'
    publisher = 'Editora Abril'
    oldest_article = 30  # in days
    max_articles_per_feed = 30
    category = 'literacy, magazine'
    language = 'pt_BR'
    publication_type = 'magazine'
    use_embedded_content = False
    no_stylesheets = True
    remove_javascript = True

    feeds = [('Revista Piaui', 'http://revistapiaui.estadao.com.br/feed/rss/edicao-atual.xml')]

    # Keep only the main article container.
    keep_only_tags = [dict(name='div', attrs={'class': 'content'})]

    # Judging by the class names: sharing widgets, font-size controls,
    # ads and extra content blocks.
    remove_tags = [
        dict(name='div', attrs={'class': 'compartilhar'}),
        dict(name='div', attrs={'class': 'size'}),
        dict(name='div', attrs={'class': 'divulgar'}),
        dict(name='div', attrs={'class': 'anuncios'}),
        dict(name='div', attrs={'class': 'bloco-conteudo-extra off'})
    ]

    # Reverse the order in which the feed lists the articles.
    reverse_article_order = True
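
For problem 3, here is a rough sketch of how the "only on the site" entries might be filtered out. It is not part of the recipe above and is untested against the real feed. The helper name drop_site_only and the marker text are assumptions; the actual wording in the feed (probably in Portuguese) has to be checked first, and I am assuming it shows up in the article title or summary.

```python
from types import SimpleNamespace

# Hypothetical marker text; check the real feed for the exact wording.
SITE_ONLY_MARKER = 'only on the site'


def drop_site_only(articles, marker=SITE_ONLY_MARKER):
    """Return the articles whose title and summary do not contain marker."""
    marker = marker.lower()
    kept = []
    for article in articles:
        # Missing or None attributes are treated as empty strings.
        title = (getattr(article, 'title', '') or '').lower()
        summary = (getattr(article, 'summary', '') or '').lower()
        if marker in title or marker in summary:
            continue
        kept.append(article)
    return kept


# Stand-in objects for demonstration only; these are not calibre articles.
sample = [
    SimpleNamespace(title='Chegada', summary='Texto completo'),
    SimpleNamespace(title='Galeria', summary='Only on the site'),
]
kept_titles = [a.title for a in drop_site_only(sample)]
```

Inside the recipe, one possible way to use it would be to override parse_feeds, call BasicNewsRecipe.parse_feeds(self) to get the list of feeds, replace each feed's articles list with drop_site_only(feed.articles), and return the feeds.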