#1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Recipe for "Frankfurter Rundschau" (Germany)
Hi,
since calibre's built-in recipe for the German newspaper Frankfurter Rundschau (fr-online.de) is broken (its feed links are outdated), I tried to get my feet wet by writing a calibre recipe of my own. So here goes: Code:
    #!/usr/bin/env python
    __license__ = 'GPL v3'
    __copyright__ = '2010, Christian Schmitt'
    '''
    fr-online.de
    '''

    # Import added so the recipe also works as a standalone file
    from calibre.web.feeds.recipes import BasicNewsRecipe

    class FROnlineRecipe(BasicNewsRecipe):
        title = 'FR Online'
        __author__ = 'maccs'
        description = 'Nachrichten aus D und aller Welt'
        encoding = 'utf-8'
        publisher = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
        category = 'news, germany, world'
        language = 'de_DE'
        publication_type = 'newspaper'
        use_embedded_content = False
        remove_javascript = True
        no_stylesheets = True
        oldest_article = 1  # Increase this number if you're interested in older articles
        max_articles_per_feed = 50  # Seems a reasonable number to me
        extra_css = '''
            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;}
            .p--heading-1 {font-weight: bold;}
            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
        '''
        remove_tags = [dict(name='div', attrs={'id':'Logo'})]
        cover_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
        cover_margins = (100, 150, '#ffffff')

        # Uncomment the feeds you are interested in by removing the hash sign in front of 'feeds.append'
        feeds = []
        feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
        # feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
        # feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
        # feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
        # feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
        # feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
        # feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
        # feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
        # feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
        # feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
        # feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
        # feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
        # feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
        # feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
        # feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
        # feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
        # feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))

        def print_version(self, url):
            # Point calibre at the printer-friendly version of each article
            return url.replace('index.html', 'view/printVersion/-/index.html')

If you have comments I'd be glad to hear about them.
Cheers, maccs
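For anyone wondering what print_version actually does to a link: it is a plain string replacement on the article URL. Here is a minimal standalone sketch of the same logic. The article URL is made up for illustration (only the trailing index.html matters), so treat it as an assumption rather than a real fr-online.de link.

```python
def print_version(url):
    """Standalone copy of the recipe's print_version logic."""
    return url.replace('index.html', 'view/printVersion/-/index.html')


# Hypothetical article URL, invented for this example
article_url = 'http://www.fr-online.de/politik/some-article/-/1472596/1234567/-/index.html'

print(print_version(article_url))

# A URL without 'index.html' is returned unchanged
print(print_version('http://www.fr-online.de/politik/'))
```

If the site ever changes its print-view path, only the replacement string needs adjusting.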
#2 |
Connoisseur
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
|
Hi,
your recipe looks much like the one I use. Two suggestions:

1. You should experiment with masthead_url. Code:

    masthead_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'

2. Try out get_cover_url. Code:

    def get_cover_url(self):
        # Set to True if the special cover image from the iPad version is wanted
        # (might not look good)
        use_ipad_cover = False
        def_cover_url = 'http://www.fr-ipad.de/wp-content/uploads/2010/09/merkel-printtitel.jpg'
        if use_ipad_cover:
            index = 'http://www.fr-ipad.de/die-ausgaben/'
            soup = self.index_to_soup(index)
            link_item = soup.find('div', attrs={'class': 'ngg-thumbnail'})
            if link_item:
                return link_item.img['src']
        # Fall back to the default cover if the iPad cover is disabled or not found
        return def_cover_url
#3 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Hi,
good call with the masthead_url. It does look good.
Cheers, maccs
#4 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Bringing back an old thread :/
I just realized that the recipe I wrote now produces awful EPUBs. Changing just a couple of lines makes the output look a lot better. Code:
    #!/usr/bin/env python
    __license__ = 'GPL v3'
    __copyright__ = '2010-2011, Christian Schmitt'
    '''
    fr-online.de
    '''

    from calibre.web.feeds.recipes import BasicNewsRecipe

    class FROnlineRecipe(BasicNewsRecipe):
        title = 'Frankfurter Rundschau'
        __author__ = 'maccs'
        description = 'Nachrichten aus D und aller Welt'
        encoding = 'utf-8'
        masthead_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
        publisher = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
        category = 'news, germany, world'
        language = 'de'
        publication_type = 'newspaper'
        use_embedded_content = False
        remove_javascript = True
        no_stylesheets = True
        oldest_article = 1  # Increase this number if you're interested in older articles
        max_articles_per_feed = 50  # Seems a reasonable number to me
        extra_css = '''
            body { font-family: "arial", "verdana", "geneva", sans-serif; font-size: 12px; margin: 0px; background-color: #ffffff;}
            .imgSubline{background-color: #f4f4f4; font-size: 0.8em;}
            .p--heading-1 {font-weight: bold;}
            .calibre_navbar {font-size: 0.8em; font-family: "arial", "verdana", "geneva", sans-serif;}
        '''
        keep_only_tags = [{'class':'ArticleHeadlineH1'}, {'class':'article_text'}]
        cover_url = 'http://www.fr-online.de/image/view/-/1474018/data/823552/-/logo.png'
        cover_margins = (100, 150, '#ffffff')

        feeds = []
        feeds.append(('Startseite', u'http://www.fr-online.de/home/-/1472778/1472778/-/view/asFeed/-/index.xml'))
        feeds.append(('Politik', u'http://www.fr-online.de/politik/-/1472596/1472596/-/view/asFeed/-/index.xml'))
        feeds.append(('Meinung', u'http://www.fr-online.de/politik/meinung/-/1472602/1472602/-/view/asFeed/-/index.xml'))
        feeds.append(('Wirtschaft', u'http://www.fr-online.de/wirtschaft/-/1472780/1472780/-/view/asFeed/-/index.xml'))
        feeds.append(('Sport', u'http://www.fr-online.de/sport/-/1472784/1472784/-/view/asFeed/-/index.xml'))
        feeds.append(('Eintracht Frankfurt', u'http://www.fr-online.de/sport/eintracht-frankfurt/-/1473446/1473446/-/view/asFeed/-/index.xml'))
        feeds.append(('Kultur und Medien', u'http://www.fr-online.de/kultur/-/1472786/1472786/-/view/asFeed/-/index.xml'))
        feeds.append(('Panorama', u'http://www.fr-online.de/panorama/-/1472782/1472782/-/view/asFeed/-/index.xml'))
        feeds.append(('Frankfurt', u'http://www.fr-online.de/frankfurt/-/1472798/1472798/-/view/asFeed/-/index.xml'))
        feeds.append(('Rhein-Main', u'http://www.fr-online.de/rhein-main/-/1472796/1472796/-/view/asFeed/-/index.xml'))
        feeds.append(('Hanau', u'http://www.fr-online.de/rhein-main/hanau/-/1472866/1472866/-/view/asFeed/-/index.xml'))
        feeds.append(('Darmstadt', u'http://www.fr-online.de/rhein-main/darmstadt/-/1472858/1472858/-/view/asFeed/-/index.xml'))
        feeds.append(('Wiesbaden', u'http://www.fr-online.de/rhein-main/wiesbaden/-/1472860/1472860/-/view/asFeed/-/index.xml'))
        feeds.append(('Offenbach', u'http://www.fr-online.de/rhein-main/offenbach/-/1472856/1472856/-/view/asFeed/-/index.xml'))
        feeds.append(('Bad Homburg', u'http://www.fr-online.de/rhein-main/bad-homburg/-/1472864/1472864/-/view/asFeed/-/index.xml'))
        feeds.append(('Digital', u'http://www.fr-online.de/digital/-/1472406/1472406/-/view/asFeed/-/index.xml'))
        feeds.append(('Wissenschaft', u'http://www.fr-online.de/wissenschaft/-/1472788/1472788/-/view/asFeed/-/index.xml'))

        def print_version(self, url):
            return url.replace('index.html', 'view/printVersion/-/index.html')

Cheers, maccs
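A side note on the feed list: every feed URL in the recipe follows the same pattern (section path, then the numeric section ID twice), so the list could also be generated from a small table instead of being written out by hand. The following is only a sketch under that assumption. `feed_url`, `BASE` and `SECTIONS` are hypothetical names made up here, not part of calibre or of the recipe, and only four of the sections are shown.

```python
# Hypothetical helper and table, not part of calibre or the original recipe
BASE = 'http://www.fr-online.de'


def feed_url(path, section_id):
    """Build a feed URL from a section path and its numeric ID."""
    return '%s/%s/-/%d/%d/-/view/asFeed/-/index.xml' % (
        BASE, path, section_id, section_id)


# (title, section path, section ID) taken from the recipe's feed URLs
SECTIONS = [
    ('Startseite', 'home', 1472778),
    ('Politik', 'politik', 1472596),
    ('Meinung', 'politik/meinung', 1472602),
    ('Wirtschaft', 'wirtschaft', 1472780),
]

# Same (title, url) tuples the recipe builds with feeds.append
feeds = [(title, feed_url(path, sid)) for title, path, sid in SECTIONS]

for title, url in feeds:
    print(title, url)
```

To use this inside a recipe, the helper and the table would probably best stay at module level, with only the `feeds = [...]` line inside the class. Dropping a section then means deleting one table row.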