MobileRead Forums > E-Book Software > Calibre > Recipes

Old 11-30-2010, 04:23 AM   #1
maccs
Junior Member
maccs began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Recipe for "Frankfurter Rundschau" (Germany)

Hi,
since calibre's built-in recipe for the German newspaper Frankfurter Rundschau (fr-online.de) is broken (its feed links are outdated), I tried to get my feet wet by writing a calibre recipe of my own.

So here goes:
Code:
#!/usr/bin/env  python

__license__            = 'GPL v3'
__copyright__          = '2010, Christian Schmitt'

'''
fr-online.de
'''

from calibre.web.feeds.recipes import BasicNewsRecipe

class FROnlineRecipe(BasicNewsRecipe):
  title                  = 'FR Online'
  __author__             = 'maccs'
  description            = 'Nachrichten aus D und aller Welt'
  encoding               = 'utf-8'
  publisher              = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
  category               = 'news, germany, world'
  language               = 'de_DE'
  publication_type       = 'newspaper'
  use_embedded_content   = False
  remove_javascript      = True    
  no_stylesheets         = True
  oldest_article         = 1   # Increase this number if you're interested in older articles
  max_articles_per_feed  = 50  # Seems a reasonable number to me
  extra_css              = '''
                            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
                            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;} 
                            .p--heading-1 {font-weight: bold;} 
                            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
                            '''
  remove_tags            = [dict(name='div', attrs={'id':'Logo'})]
  cover_url              = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  cover_margins          = (100, 150, '#ffffff')


  # Uncomment the feeds you are interested in by removing the hash sign in front of 'feeds.append'
  feeds = []
  feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
  # feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
  # feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
  # feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
  # feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
  # feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
  # feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
  # feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
  # feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
  # feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
  # feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
  # feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
  # feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
  # feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
  # feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))

  
  def print_version(self, url):
    return url.replace('index.html', 'view/printVersion/-/index.html')
This recipe works for me
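
In case it is not obvious what print_version does: calibre calls it with each article URL from the feed and downloads whatever URL it returns instead. Here is a minimal stand-alone sketch of the same substitution, outside of calibre. The article URL is made up for illustration and is not a real fr-online.de link.

```python
def print_version(url):
    # Same substitution as in the recipe: swap the trailing page name
    # for the print-view path.
    return url.replace('index.html', 'view/printVersion/-/index.html')

# Hypothetical article URL, made up for illustration only
article = 'http://www.fr-online.de/politik/beispiel/-/1472596/0000000/-/index.html'
printable = print_version(article)
print(printable)
```

A URL that does not contain 'index.html' is returned unchanged, so such links would simply be fetched as they are.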

If you have comments I'd be glad to hear about them.

Cheers,
maccs
Old 11-30-2010, 04:55 AM   #2
miwie
Connoisseur
miwie began at the beginning.
 
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
Hi,

your recipe looks mostly like the one I use.
Two suggestions:

1. You should experiment with
Code:
masthead_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
which puts the logo above the main menu (which IMHO looks good)

2. Try out
Code:
    # True if special cover image from iPad version wanted (might not look good)
    use_ipad_cover = False

    def get_cover_url(self):
        # Fallback cover, used when the iPad cover is disabled or cannot be found
        def_cover_url = 'http://www.fr-ipad.de/wp-content/uploads/2010/09/merkel-printtitel.jpg'

        if self.use_ipad_cover:
            index = 'http://www.fr-ipad.de/die-ausgaben/'
            soup = self.index_to_soup(index)
            # The first thumbnail on the issues page is taken as the current cover
            link_item = soup.find('div', attrs={'class': 'ngg-thumbnail'})
            if link_item and link_item.img:
                return link_item.img['src']

        return def_cover_url
which optionally uses the iPad cover image as the EPUB cover, though it does not scale very well.
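
To make the fallback logic easier to try out without calibre, here is a rough stand-alone sketch of the same idea: look for the first image inside a div with class ngg-thumbnail and fall back to a default URL if none is found. It uses Python's standard html.parser as a stand-in for calibre's index_to_soup, and the names ThumbnailFinder and pick_cover as well as the sample HTML are my own inventions for illustration, not part of the recipe.

```python
from html.parser import HTMLParser


class ThumbnailFinder(HTMLParser):
    """Remembers the src of the first <img> inside <div class="ngg-thumbnail">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than 0 while inside the thumbnail div
        self.src = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            classes = (attrs.get('class') or '').split()
            # Count nested divs too, so the matching </div> is tracked correctly
            if self.depth or 'ngg-thumbnail' in classes:
                self.depth += 1
        elif tag == 'img' and self.depth and self.src is None:
            self.src = attrs.get('src')

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1


def pick_cover(html, default_url):
    # Return the thumbnail image URL if present, otherwise the default
    finder = ThumbnailFinder()
    finder.feed(html)
    return finder.src or default_url


# Hypothetical page snippets, for illustration only
page = '<div class="ngg-thumbnail"><a href="#"><img src="http://example.invalid/cover.jpg"></a></div>'
other = '<div class="other"><img src="http://example.invalid/ad.jpg"></div>'
print(pick_cover(page, 'default.jpg'))
print(pick_cover(other, 'default.jpg'))
```

Inside a recipe you would keep using self.index_to_soup as shown above; the sketch only mirrors the "use the thumbnail if it exists, else the default" decision.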
Old 11-30-2010, 06:32 AM   #3
maccs
Hi,
good call with the masthead_url. It does look good

Cheers,
maccs
Old 11-10-2011, 04:30 PM   #4
maccs
Bringing back an old thread :/
I just realized that the recipe I wrote now creates awful EPUBs.

Changing just a couple of lines makes the result look a lot better.

Code:
#!/usr/bin/env  python

__license__            = 'GPL v3'
__copyright__          = '2010-2011, Christian Schmitt'

'''
fr-online.de
'''

from calibre.web.feeds.recipes import BasicNewsRecipe

class FROnlineRecipe(BasicNewsRecipe):
  title                  = 'Frankfurter Rundschau'
  __author__             = 'maccs'
  description            = 'Nachrichten aus D und aller Welt'
  encoding               = 'utf-8'
  masthead_url           = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  publisher              = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
  category               = 'news, germany, world'
  language               = 'de'
  publication_type       = 'newspaper'
  use_embedded_content   = False
  remove_javascript      = True
  no_stylesheets         = True
  oldest_article         = 1   # Increase this number if you're interested in older articles
  max_articles_per_feed  = 50  # Seems a reasonable number to me
  extra_css              = '''
                            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
                            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;}
                            .p--heading-1 {font-weight: bold;}
                            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
                            '''
  keep_only_tags         = [{'class':'ArticleHeadlineH1'}, {'class':'article_text'}]
  cover_url              = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
  cover_margins          = (100, 150, '#ffffff')


  feeds = []
  feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
  feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
  feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
  feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
  feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
  feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
  feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
  feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
  feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
  feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
  feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
  feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
  feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
  feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
  feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
  feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
  feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))


  def print_version(self, url):
    return url.replace('index.html', 'view/printVersion/-/index.html')

Cheers,
maccs