#1
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
Recipe for Heise-online - German (almost all subjects)
Code:

    from calibre.web.feeds.news import BasicNewsRecipe


    class AdvancedUserRecipe(BasicNewsRecipe):
        title = 'Heise-online'
        description = 'News vom Heise-Verlag'
        __author__ = 'schuster'
        use_embedded_content = False
        language = 'de'
        oldest_article = 2
        max_articles_per_feed = 35
        rescale_images = True
        remove_empty_feeds = True
        timeout = 5
        no_stylesheets = True

        remove_tags_after = dict(name='p', attrs={'class': 'editor'})
        remove_tags = [
            dict(id='navi_top_container'),
            dict(id='navi_bottom'),
            dict(id='mitte_rechts'),
            dict(id='navigation'),
            dict(id='subnavi'),
            dict(id='social_bookmarks'),
            dict(id='permalink'),
            dict(id='content_foren'),
            dict(id='seiten_navi'),
            dict(id='adbottom'),
            dict(id='sitemap'),
        ]

        feeds = [
            ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
            ('Auto', 'http://www.heise.de/autos/rss/news.rdf'),
            ('Foto ', 'http://www.heise.de/foto/rss/news-atom.xml'),
            ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
            ('Mobile ', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
            ('Netz ', 'http://www.heise.de/netze/rss/netze-atom.xml'),
            ('Open ', 'http://www.heise.de/open/news/news-atom.xml'),
            ('Resale ', 'http://www.heise.de/resale/rss/resale.rdf'),
            ('Security ', 'http://www.heise.de/security/news/news-atom.xml'),
            ('C`t', 'http://www.heise.de/ct/rss/artikel-atom.xml'),
            ('iX', 'http://www.heise.de/ix/news/news.rdf'),
            ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
            ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
            ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
            ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
            ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
            ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
        ]

        def print_version(self, url):
            return url + '?view=print'
#2
Connoisseur
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
Please consider adding the following:
Code:

    masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co. KG'
    ...
    conversion_options = {'publisher': publisher,
                          ...
                          }
#3
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
Update - cleaner content
Code:

    from calibre.web.feeds.news import BasicNewsRecipe


    class AdvancedUserRecipe(BasicNewsRecipe):
        title = 'Heise-online'
        description = 'News vom Heise-Verlag'
        __author__ = 'schuster'
        masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
        publisher = 'Heise Zeitschriften Verlag GmbH & Co. KG'
        use_embedded_content = False
        language = 'de'
        oldest_article = 2
        max_articles_per_feed = 35
        rescale_images = True
        remove_empty_feeds = True
        timeout = 5
        no_stylesheets = True

        remove_tags_after = dict(name='p', attrs={'class': 'editor'})
        remove_tags = [
            dict(id='navi_top_container'),
            dict(id='navi_bottom'),
            dict(id='mitte_rechts'),
            dict(id='navigation'),
            dict(id='subnavi'),
            dict(id='social_bookmarks'),
            dict(id='permalink'),
            dict(id='content_foren'),
            dict(id='seiten_navi'),
            dict(id='adbottom'),
            dict(id='sitemap'),
            dict(name='div', attrs={'id': 'sitemap'}),
            dict(name='ul', attrs={'class': 'erste_zeile'}),
            dict(name='ul', attrs={'class': 'zweite_zeile'}),
            dict(name='div', attrs={'class': 'navi_top_container'}),
        ]

        feeds = [
            ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
            ('Auto', 'http://www.heise.de/autos/rss/news.rdf'),
            ('Foto ', 'http://www.heise.de/foto/rss/news-atom.xml'),
            ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
            ('Mobile ', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
            ('Netz ', 'http://www.heise.de/netze/rss/netze-atom.xml'),
            ('Open ', 'http://www.heise.de/open/news/news-atom.xml'),
            ('Resale ', 'http://www.heise.de/resale/rss/resale.rdf'),
            ('Security ', 'http://www.heise.de/security/news/news-atom.xml'),
            ('C`t', 'http://www.heise.de/ct/rss/artikel-atom.xml'),
            ('iX', 'http://www.heise.de/ix/news/news.rdf'),
            ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
            ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
            ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
            ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
            ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
            ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
        ]

        def print_version(self, url):
            return url + '?view=print'
#4
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
Update - cleaner content from print version
Code:

    from calibre.web.feeds.news import BasicNewsRecipe


    class AdvancedUserRecipe(BasicNewsRecipe):
        title = 'Heise-online'
        description = 'News vom Heise-Verlag'
        __author__ = 'schuster'
        masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
        publisher = 'Heise Zeitschriften Verlag GmbH & Co. KG'
        use_embedded_content = False
        language = 'de'
        oldest_article = 2
        max_articles_per_feed = 35
        rescale_images = True
        remove_empty_feeds = True
        timeout = 5
        no_stylesheets = True

        keep_only_tags = [
            dict(name='div', attrs={'id': 'mitte_news'}),
            dict(name='div', attrs={'id': 'news'}),
            dict(name='div', attrs={'id': 'print-blog'}),
            dict(name='h1', attrs={'class': 'clear'}),
            dict(name='div', attrs={'class': 'meldung_wrapper'}),
            dict(name='div', attrs={'id': 'artikel'}),
        ]
        remove_tags = [
            dict(id='navi_top_container'),
            dict(name='p', attrs={'class': 'size80'}),
        ]

        feeds = [
            ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
            ('Developer', 'http://www.heise.de/developer/rss/news-atom.xml'),
            ('Foto ', 'http://www.heise.de/foto/rss/news-atom.xml'),
            ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
            ('Mobile ', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
            ('Netz ', 'http://www.heise.de/netze/rss/netze-atom.xml'),
            ('Open ', 'http://www.heise.de/open/news/news-atom.xml'),
            ('Resale ', 'http://www.heise.de/resale/rss/resale.rdf'),
            ('Security ', 'http://www.heise.de/security/news/news-atom.xml'),
            ('C`t', 'http://www.heise.de/ct/rss/artikel-atom.xml'),
            ('Hardware Hacks', 'http://www.heise.de/hardware-hacks/rss/hardware-hacks-atom.xml'),
            ('iX', 'http://www.heise.de/ix/news/news.rdf'),
            ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
            ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
            ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
            ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
            ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
            ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
        ]

        def print_version(self, url):
            return url + '?view=print'

Last edited by schuster; 12-07-2012 at 01:31 AM. Reason: wrong recipe posted
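The recipes in this thread build the print URL by appending `?view=print` to the article link. That works as long as the link has no query string or fragment of its own. As a rough sketch, assuming some feed links might carry one (the thread does not say whether they do), the same step could be written with the standard library. The helper name `to_print_url` is hypothetical and not part of calibre, and the example URLs are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_print_url(url):
    """Return url with view=print set, keeping other query parameters.

    Hypothetical helper, not a calibre API. It replaces any existing
    'view' parameter instead of appending a second one.
    """
    parts = urlsplit(url)
    # Keep every existing parameter except a previous 'view' value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != 'view']
    query.append(('view', 'print'))
    # Reassemble the URL; the fragment stays after the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))


if __name__ == '__main__':
    # Made-up article links, only to show the three cases.
    print(to_print_url('http://www.heise.de/newsticker/meldung/example.html'))
    print(to_print_url('http://www.heise.de/newsticker/meldung/example.html?from=rss'))
    print(to_print_url('http://www.heise.de/newsticker/meldung/example.html#top'))
```

Inside a recipe, `print_version(self, url)` could simply return `to_print_url(url)`. For links without a query string or fragment the result is the same as the plain string concatenation used above.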