11-30-2010, 04:23 AM   #1
maccs
Junior Member
maccs began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Recipe for "Frankfurter Rundschau" (Germany)

Hi,
Since calibre's built-in recipe for the German newspaper Frankfurter Rundschau (fr-online.de) is broken (its feed links are outdated), I tried to get my feet wet by writing a calibre recipe of my own.

So here goes:
Code:
#!/usr/bin/env  python

__license__            = 'GPL v3'
__copyright__          = '2010, Christian Schmitt'

'''
fr-online.de
'''

from calibre.web.feeds.recipes import BasicNewsRecipe

class FROnlineRecipe(BasicNewsRecipe):
  title                  = 'FR Online'
  __author__             = 'maccs'
  description            = 'Nachrichten aus D und aller Welt'
  encoding               = 'utf-8'
  publisher              = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
  category               = 'news, germany, world'
  language               = 'de_DE'
  publication_type       = 'newspaper'
  use_embedded_content   = False
  remove_javascript      = True    
  no_stylesheets         = True
  oldest_article         = 1   # Increase this number if you're interested in older articles
  max_articles_per_feed  = 50  # Seems a reasonable number to me
  extra_css              = '''
                            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
                            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;} 
                            .p--heading-1 {font-weight: bold;} 
                            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
                            '''
  remove_tags            = [dict(name='div', attrs={'id':'Logo'})]
  cover_url              = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  cover_margins          = (100, 150, '#ffffff')


  # Uncomment the feeds you are interested in by removing the hash sign in front of 'feeds.append'
  feeds = []
  feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
  # feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
  # feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
  # feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
  # feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
  # feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
  # feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
  # feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
  # feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
  # feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
  # feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
  # feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
  # feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
  # feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))

  
  def print_version(self, url):
    return url.replace('index.html', 'view/printVersion/-/index.html')
This recipe works for me.
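
To see what print_version does, here is a minimal sketch that runs the same string replacement outside of calibre. The article URL is made up for illustration; only the replacement itself comes from the recipe.

```python
def print_version(url):
    # Same replacement as in the recipe: insert the print-view path
    # segment in front of the trailing 'index.html'.
    return url.replace('index.html', 'view/printVersion/-/index.html')


# Hypothetical article URL, invented for this example.
article = 'http://www.fr-online.de/politik/beispiel/-/1472596/0000000/-/index.html'
print(print_version(article))

# A URL that does not contain 'index.html' comes back unchanged.
print(print_version('http://www.fr-online.de/'))
```

Note that a URL without 'index.html' is returned as-is, so calibre would then fetch the regular page rather than the print view.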

If you have comments, I'd be glad to hear them.

Cheers,
maccs
11-30-2010, 04:55 AM   #2
miwie
Connoisseur
miwie began at the beginning.
 
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
Hi,

Your recipe looks much like the one I use.
Two suggestions:

1. You should experiment with
Code:
masthead_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
which puts the logo above the main menu (IMHO this looks good).

2. Try out
Code:
    # Set to True to use the cover image from the iPad edition (might not look good)
    use_ipad_cover = False

    def get_cover_url(self):
        # Fallback cover, used when the iPad cover is disabled or cannot be found
        def_cover_url = 'http://www.fr-ipad.de/wp-content/uploads/2010/09/merkel-printtitel.jpg'

        if self.use_ipad_cover:
            index = 'http://www.fr-ipad.de/die-ausgaben/'
            soup = self.index_to_soup(index)
            # Use the image of the first gallery thumbnail on the issues page
            link_item = soup.find('div', attrs={'class': 'ngg-thumbnail'})
            if link_item is not None and link_item.img is not None:
                return link_item.img['src']

        return def_cover_url
which optionally uses the iPad cover image as the EPUB cover, though it does not scale very well.
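
The lookup-with-fallback logic can be tried without calibre. The sketch below is a stand-in, not the recipe code: it uses Python 3's standard html.parser instead of calibre's index_to_soup, and the sample HTML and the names ThumbnailFinder and pick_cover are made up for illustration. Only the class name 'ngg-thumbnail' and the default cover URL come from the snippet above.

```python
from html.parser import HTMLParser

DEFAULT_COVER = 'http://www.fr-ipad.de/wp-content/uploads/2010/09/merkel-printtitel.jpg'


class ThumbnailFinder(HTMLParser):
    """Records the src of the first <img> inside <div class="ngg-thumbnail">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while we are inside the thumbnail div
        self.src = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            classes = (attrs.get('class') or '').split()
            # Count nested divs so we know when the thumbnail div closes.
            if self.depth or 'ngg-thumbnail' in classes:
                self.depth += 1
        elif tag == 'img' and self.depth and self.src is None:
            self.src = attrs.get('src')

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1


def pick_cover(html, default_url):
    # Return the thumbnail image if one is found, else the default cover.
    finder = ThumbnailFinder()
    finder.feed(html)
    return finder.src or default_url


# Hypothetical page fragment, invented for this example.
sample = '<div class="ngg-thumbnail"><a href="#"><img src="http://example.invalid/cover.jpg"></a></div>'
print(pick_cover(sample, DEFAULT_COVER))
print(pick_cover('<p>no gallery here</p>', DEFAULT_COVER))
```

The point of the fallback is that the recipe still gets a cover if the gallery markup on the iPad site changes or the thumbnail is missing.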
11-30-2010, 06:32 AM   #3
maccs
Hi,
Good call on the masthead_url. It does look good.

Cheers,
maccs
11-10-2011, 04:30 PM   #4
maccs
Bringing back an old thread :/
I just realized that the recipe I wrote now produces awful EPUBs.

Changing just a couple of lines makes the output look a lot better.

Code:
#!/usr/bin/env  python

__license__            = 'GPL v3'
__copyright__          = '2010-2011, Christian Schmitt'

'''
fr-online.de
'''

from calibre.web.feeds.recipes import BasicNewsRecipe

class FROnlineRecipe(BasicNewsRecipe):
  title                  = 'Frankfurter Rundschau'
  __author__             = 'maccs'
  description            = 'Nachrichten aus D und aller Welt'
  encoding               = 'utf-8'
  masthead_url           = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  publisher              = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
  category               = 'news, germany, world'
  language               = 'de'
  publication_type       = 'newspaper'
  use_embedded_content   = False
  remove_javascript      = True
  no_stylesheets         = True
  oldest_article         = 1   # Increase this number if you're interested in older articles
  max_articles_per_feed  = 50  # Seems a reasonable number to me
  extra_css              = '''
                            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
                            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;}
                            .p--heading-1 {font-weight: bold;}
                            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
                            '''
  keep_only_tags         = [{'class':'ArticleHeadlineH1'}, {'class':'article_text'}]
  cover_url              = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  cover_margins          = (100, 150, '#ffffff')


  feeds = []
  feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
  feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
  feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
  feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
  feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
  feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
  feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
  feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
  feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
  feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
  feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
  feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
  feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
  feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
  feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
  feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
  feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))


  def print_version(self, url):
    return url.replace('index.html', 'view/printVersion/-/index.html')

Cheers,
maccs