Old 01-26-2012, 02:07 PM   #12
Divingduck
Fanatic
Posts: 503
Karma: 33884
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Here is a new update for the recipe.
Changes:
- add the missing feeds for Bildung, Gesundheit and Stil
- integrate the flexible cover of the day from Süddeutsche Zeitung (taken from Darko Miletic's paid-content source)
- add the correct masthead for Süddeutsche.de
- add/correct some recipe information variables
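
Most feed URLs in the recipe follow one pattern: a fixed search base plus a percent-encoded filter of the form §ressort:^Name$. As a rough sketch (Python 3, standard library only; `SEARCH_BASE` and `search_feed_url` are names made up for this illustration and are not part of the recipe or of calibre), such a URL could be built like this:

```python
from urllib.parse import quote, unquote

# Common prefix of the search-based feeds (copied from the recipe's feed list).
SEARCH_BASE = 'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/'


def search_feed_url(ressort):
    """Build a search-based RSS URL for one section (hypothetical helper)."""
    # The filter reads '§ressort:^<name>$'; with safe='' every reserved or
    # non-ASCII character is percent-encoded as UTF-8.
    drilldown = quote('§ressort:^' + ressort + '$', safe='')
    return SEARCH_BASE + drilldown + '?output=rss'


print(search_feed_url('Politik'))
# Decoding the filter part shows what the search is actually asked for.
print(unquote('%C2%A7ressort%3A%5EPolitik%24'))
```

The three new feeds (Bildung, Gesundheit, Stil) do not use this pattern; they come from plain rss.sueddeutsche.de addresses.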

Code:
# -*- coding: utf-8 -*-
__license__   = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid at kovidgoyal.net>' # 2012-01-26 AGe change to current year

'''
Fetch sueddeutsche.de
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Sueddeutsche(BasicNewsRecipe):

    title                 = u'Süddeutsche.de'                 # 2012-01-26 AGe Correct Title
    description           = 'News from Germany, Access to online content' # 2012-01-26 AGe
    __author__            = 'Oliver Niesner and Armin Geller' #Update AGe 2012-01-26
    publisher             = 'Süddeutsche Zeitung'             # 2012-01-26 AGe add
    category              = 'news, politics, Germany'         # 2012-01-26 AGe add
    timefmt               = ' [%a, %d %b %Y]'                 # 2012-01-26 AGe add %a
    oldest_article        = 7
    max_articles_per_feed = 100
    language              = 'de'
    encoding              = 'utf-8'
    publication_type      = 'newspaper'                         # 2012-01-26 add
    cover_source          = 'http://www.sueddeutsche.de/verlag' # 2012-01-26 AGe add from Darko Miletic paid content source
    masthead_url          = 'http://www.sueddeutsche.de/static_assets/build/img/sdesiteheader/logo_homepage.441d531c.png' # 2012-01-26 AGe add

    use_embedded_content  = False
    no_stylesheets        = True
    remove_javascript     = True
    auto_cleanup          = True
    
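    def get_cover_url(self):                                      # 2012-01-26 AGe add from Darko Miletic paid content source
        # Scrape the publisher page for the cover image of the day.
        cover_source_soup = self.index_to_soup(self.cover_source)
        preview_image_div = cover_source_soup.find(attrs={'class':'preview-image'})
        if preview_image_div is None:
            return None  # page layout changed: no cover URL available
        return preview_image_div.div.img['src']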

    feeds = [
              (u'Politik', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EPolitik%24?output=rss'),
              (u'Wirtschaft', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EWirtschaft%24?output=rss'),
              (u'Geld', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EGeld%24?output=rss'),
              (u'Kultur', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EKultur%24?output=rss'),
              (u'Sport', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5ESport%24?output=rss'),
              (u'Leben', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5ELeben%24?output=rss'),
              (u'Karriere', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EKarriere%24?output=rss'),
              (u'Bildung', u'http://rss.sueddeutsche.de/rss/bildung'),         #2012-01-26 AGe New
              (u'Gesundheit', u'http://rss.sueddeutsche.de/rss/gesundheit'),   #2012-01-26 AGe New
              (u'Stil', u'http://rss.sueddeutsche.de/rss/stil'),               #2012-01-26 AGe New
              (u'München & Region', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EM%C3%BCnchen&Region%24?output=rss'),
              (u'Bayern', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EBayern%24?output=rss'),
              (u'Medien', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EMedien%24?output=rss'),
              (u'Digital', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EDigital%24?output=rss'),
              (u'Auto', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EAuto%24?output=rss'),
              (u'Wissen', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EWissen%24?output=rss'),
              (u'Panorama', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EPanorama%24?output=rss'),
              (u'Reise', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EReise%24?output=rss'),
              (u'Technik', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5ETechnik%24?output=rss'), # sometimes only
              (u'Macht', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EMacht%24?output=rss'),     # sometimes only
              (u'Job', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EJob%24?output=rss'),         # sometimes only
              (u'Service', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EService%24?output=rss'), # sometimes only
              (u'Verlag', u'http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5EVerlag%24?output=rss'),   # sometimes only
              
            ]
    # AGe 2011-12-16: Redirect handling solved with the re-usable recipe code from kiklop74.
    # Feed is:                    http://suche.sueddeutsche.de/query/%23/sort/-docdatetime/drilldown/%C2%A7ressort%3A%5ESport%24?output=rss
    # Article download source is: http://sz.de/1.1237295 (Ski Alpin: Der Erfolg kommt, der Trainer geht)
    # Article source is:          http://www.sueddeutsche.de/sport/ski-alpin-der-erfolg-kommt-der-trainer-geht-1.1237295
    # Article print version is:   http://www.sueddeutsche.de/sport/2.220/ski-alpin-der-erfolg-kommt-der-trainer-geht-1.1237295
    def print_version(self, url):
        # Follow the sz.de short-link redirect to get the full article URL.
        n_url = self.browser.open_novisit(url).geturl()
        # Insert '/2.220/' before the last path segment to get the print version.
        main, sep, article_id = n_url.rpartition('/')
        return main + '/2.220/' + article_id
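
To make the print-version step easier to follow, here is a minimal standalone sketch of just the string handling (Python 3, no network access). The function name `to_print_url` is invented for this illustration; in the recipe the same logic runs inside `print_version` after the sz.de redirect has already been resolved.

```python
def to_print_url(resolved_url):
    """Turn a full article URL into its print-version URL (hypothetical helper)."""
    # Split at the last '/': everything before it is the section path,
    # everything after it is the article slug ending in the article number.
    main, _sep, article_id = resolved_url.rpartition('/')
    # The print version lives under an extra '/2.220/' path segment.
    return main + '/2.220/' + article_id


article = 'http://www.sueddeutsche.de/sport/ski-alpin-der-erfolg-kommt-der-trainer-geht-1.1237295'
print(to_print_url(article))
```

For the sample article named in the recipe comments, this gives the same print-version address that is listed there.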


Hope you like it.
DD
Attached Files
File Type: zip Suddeutsche_AGe.zip (1.5 KB, 55 views)