MobileRead Forums - View Single Post - Help on a non working recipes (Sole24Ore)

bosplans · 03-29-2011, 10:31 AM

I wrote a nice recipe and it has worked fine so far.
A couple of days ago the news source changed its feed structure and now uses feedsportal as its service provider.
Now my old recipe does not work anymore ;-)

I figured out that the problem is the obscured links they use now, so I guess it should be possible to use the print_version method with some regex on the URL. Unfortunately, I do not know regex!
I need to convert the link structure from:

http://rss.feedsportal.com/c/32276/f/566660/s/13b7117a/l/0L0Silsole24ore0N0Cart0Cnotizie0C20A110E0A30E290Clampedusa0Eabitanti0Eoccupano0Emunicipio0E11570A60Bshtml0Duuid0FAauakRKD/story01.htm

to:

http://www.ilsole24ore.com/art/notizie/2011-03-29/lampedusa-abitanti-occupano-municipio-115706_PRN.shtml

The first part of the link is static; the dynamic part is the one in bold. I know the first link contains all the information needed, but I cannot figure out the code.

Any help?
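For illustration, here is a minimal sketch of how the conversion described above might be done. The escape table is only inferred by comparing the two example URLs in this post (for instance 0C seems to stand for "/" and 0E for "-"), so the service may well use other codes that are missing here. The names FEEDSPORTAL_CODES, decode_feedsportal_url and printable_url are made up for this example and are not part of calibre.

```python
import re

# Escape codes inferred by comparing the two example URLs in this post.
# This table is an assumption; the feed service may use further codes.
FEEDSPORTAL_CODES = {
    'A': '0', 'B': '.', 'C': '/', 'D': '?', 'E': '-',
    'F': '=', 'L': 'http://', 'N': '.com', 'S': 'www.',
}


def decode_feedsportal_url(url):
    """Return the article URL hidden inside a feedsportal story link."""
    parts = url.split('/')
    # Leave anything that does not look like a feedsportal story link alone.
    if ('feedsportal.com' not in url or len(parts) < 2
            or not parts[-1].startswith('story')):
        return url
    encoded = parts[-2]
    # One left-to-right pass: a literal '0' is itself written as '0A', so
    # every '0' in the encoded text starts a two-character escape code.
    # Unknown codes are left untouched.
    return re.sub(r'0([A-Z])',
                  lambda m: FEEDSPORTAL_CODES.get(m.group(1), m.group(0)),
                  encoded)


def printable_url(url):
    """Decode the link, drop the query string and point at the _PRN page."""
    article = decode_feedsportal_url(url).split('?')[0]
    return article.replace('.shtml', '_PRN.shtml')
```

Inside the recipe, print_version could then simply return printable_url(url), assuming calibre hands the feedsportal link to that method. A single regex pass is used instead of a chain of str.replace calls because replacing 0A with 0 too early could create a fake escape code with the letter that follows it.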

Here is the old recipe:

Code:

__author__    = 'Marco Saraceno'
__copyright__ = '2010, Marco Saraceno <marcosaraceno at gmail.com>'
description   = 'Italian daily newspaper - v 1.1 (Mar14,2011)'

'''
http://www.ilsole24ore.com
'''

from calibre.web.feeds.news import BasicNewsRecipe

class IlSole24Ore(BasicNewsRecipe):
    __author__        = 'Marco Saraceno'
    description   = 'Italian financial daily newspaper'

    cover_url      = 'http://www.shopping24.ilsole24ore.com/ProductRelated/rds/img/logo_sole.gif'
    title          = u'Il Sole 24 Ore'
    publisher      = 'Gruppo editoriale GRUPPO 24ORE'
    category       = 'News, politics, culture, economy, financial, Italian'

    language       = 'it'
    timefmt        = '[%a, %d %b, %Y]'

    oldest_article = 2
    max_articles_per_feed = 100
    use_embedded_content  = False
    recursion             = 2
    extra_css      = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt  }'

         
    remove_tags = [
        dict(name='div', attrs={'class':['header','titolo']}),
        dict(name='table', attrs={'class':['footer1024','footerdown']}),
    ]

    feeds = [
        (u'Notizie Italia', u'http://www.ilsole24ore.com/rss/notizie/italia.xml'),
        (u'Notizie Europa', u'http://www.ilsole24ore.com/rss/notizie/europa.xml'),
        (u'Notizie USA', u'http://www.ilsole24ore.com/rss/notizie/usa.xml'),
        (u'Notizie Americhe', u'http://www.ilsole24ore.com/rss/notizie/americhe.xml'),
        (u'Notizie Medio Oriente e Africa', u'http://www.ilsole24ore.com/rss/notizie/medio-oriente-e-africa.xml'),
        (u'Notizie Asia e Oceania', u'http://www.ilsole24ore.com/rss/notizie/asia-e-oceania.xml'),
        (u'Commenti', u'http://www.ilsole24ore.com/rss/commenti-e-idee.xml'),
        (u'Norme e tributi', u'http://www.ilsole24ore.com/rss/norme-e-tributi.xml'),
        (u'Finanza', u'http://www.ilsole24ore.com/rss/finanza-e-mercati.xml'),
        (u'Economia', u'http://www.ilsole24ore.com/rss/economia.xml'),
        (u'Tecnologia', u'http://www.ilsole24ore.com/rss/tecnologie.xml'),
        (u'Cultura', u'http://www.ilsole24ore.com/rss/cultura.xml'),
    ]

	 
    def print_version(self, url):
        # The printable page has the same URL with _PRN added before .shtml
        return url.replace('.shtml', '_PRN.shtml')

Moderator Notice
Thread closed as a duplicate of this thread:
https://www.mobileread.com/forums/sho...d.php?t=127486
Please don't create duplicate threads.