Old 11-19-2014, 11:08 AM   #1
heYooh
Help to fix Adnkronos recipe

Hi everyone, the Adnkronos recipe no longer works correctly: the final ebook contains only the URLs of the articles instead of the articles themselves. I opened another thread on this forum and Kovid Goyal suggested that I use get_obfuscated_article(), but that is beyond my abilities since I'm not a programmer.

Could anyone kindly fix this recipe? It is one of the recipes that ships with Calibre, but I've included it below for convenience.

Code:
#!/usr/bin/env  python
__license__   = 'GPL v3'
__author__    = 'Gabriele Marini, based on Darko Miletic'
__copyright__ = '2009-2010, Darko Miletic <darko.miletic at gmail.com>'
description   = 'Italian daily newspaper - 02-05-2010'

'''
http://www.adnkronos.com/
'''

from calibre.web.feeds.news import BasicNewsRecipe

class Adnkronos(BasicNewsRecipe):
    __author__        = 'Gabriele Marini'
    description   = 'News agency'
    cover_url      = 'http://www.adnkronos.com/IGN6/img/popup_ign.jpg'
    title          = u'Adnkronos'
    publisher      = 'Adnkronos Group - News agency'
    category       = 'News, politics, culture, economy, general interest'

    language       = 'it'
    timefmt        = '[%a, %d %b, %Y]'

    oldest_article = 7
    max_articles_per_feed = 80
    use_embedded_content  = False
    recursion             = 10

    remove_javascript = True

    # Use the feed entry's id (falling back to guid) as the article URL.
    def get_article_url(self, article):
        link = article.get('id', article.get('guid', None))
        return link

    extra_css = ' .newsAbstract{font-style: italic} '
    keep_only_tags     = [dict(name='div', attrs={'class':['breadCrumbs','newsTop','newsText']})
                         ]


    remove_tags        = [
                            dict(name='div', attrs={'class':['leogoo','leogoo2']})
                         ]


    feeds          = [
                       (u'Prima Pagina', u'http://rss.adnkronos.com/RSS_PrimaPagina.xml'),
                       (u'Ultima Ora', u'http://rss.adnkronos.com/RSS_Ultimora.xml'),
                       (u'Politica', u'http://rss.adnkronos.com/RSS_Politica.xml'),
                       (u'Esteri', u'http://rss.adnkronos.com/RSS_Esteri.xml'),
                       (u'Cronaca', u'http://rss.adnkronos.com/RSS_Cronaca.xml'),
                       (u'Economia', u'http://rss.adnkronos.com/RSS_Economia.xml'),
                       (u'Finanza', u'http://rss.adnkronos.com/RSS_Finanza.xml'),
                       (u'CyberNews', u'http://rss.adnkronos.com/RSS_CyberNews.xml'),
                       (u'Spettacolo', u'http://rss.adnkronos.com/RSS_Spettacolo.xml'),
                       (u'Cultura', u'http://rss.adnkronos.com/RSS_Cultura.xml'),
                       (u'Sport', u'http://rss.adnkronos.com/RSS_Sport.xml'),
                       (u'Sostenibilita', u'http://rss.adnkronos.com/RSS_Sostenibilita.xml'),
                       (u'Salute', u'http://rss.adnkronos.com/RSS_Salute.xml')
                      ]
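
For whoever picks this up, here is a minimal sketch of what the get_obfuscated_article() suggestion seems to involve. As I understand the calibre recipe documentation, a recipe that sets articles_are_obfuscated = True has get_obfuscated_article(url) called for every article, and that method should return the path of a file containing the article HTML. The helper below only illustrates that "write the HTML to a file and return its path" step with the standard library. The name save_article_html is hypothetical and not part of calibre, and how to obtain the real article HTML from the Adnkronos site is an open question, so the recipe wiring in the trailing comment is an untested assumption.

```python
import tempfile


def save_article_html(html_bytes):
    """Write article HTML to a temporary file and return that file's path.

    Hypothetical helper for illustration; it is not part of calibre.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so it can still be opened later by path.
    with tempfile.NamedTemporaryFile(suffix='.html', delete=False) as handle:
        handle.write(html_bytes)
    return handle.name


# Inside the recipe class this might be wired up roughly as follows
# (untested assumption, shown only to indicate where the helper fits):
#
#     articles_are_obfuscated = True
#
#     def get_obfuscated_article(self, url):
#         html = self.get_browser().open(url).read()
#         return save_article_html(html)
```

If the guid in the feed no longer points at the real article page, the URL passed in would also need to be corrected before fetching; the sketch does not attempt that.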

Last edited by heYooh; 11-19-2014 at 11:12 AM. Reason: I made the title more specific.