06-05-2011, 09:51 AM   #1
schuster
Zealot
Posts: 116
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW / Köln
Device: prs-650 / prs-350 /kindle 3
Recipe for Heise-online - German (almost all subjects)

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe(BasicNewsRecipe):

    title = 'Heise-online'
    description = 'News vom Heise-Verlag'
    __author__ = 'schuster'
    use_embedded_content = False
    language = 'de'
    oldest_article = 2  # days
    max_articles_per_feed = 35
    rescale_images = True
    remove_empty_feeds = True
    timeout = 5  # seconds
    no_stylesheets = True

    # Cut each page off after the <p class="editor"> element, then strip
    # navigation, social-bookmark, forum and ad containers by id.
    remove_tags_after = dict(name='p', attrs={'class': 'editor'})
    remove_tags = [
        dict(id='navi_top_container'),
        dict(id='navi_bottom'),
        dict(id='mitte_rechts'),
        dict(id='navigation'),
        dict(id='subnavi'),
        dict(id='social_bookmarks'),
        dict(id='permalink'),
        dict(id='content_foren'),
        dict(id='seiten_navi'),
        dict(id='adbottom'),
        dict(id='sitemap'),
    ]

    # (title, feed URL) pairs; titles are shown as section names in the e-book.
    feeds = [
        ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
        ('Auto', 'http://www.heise.de/autos/rss/news.rdf'),
        ('Foto', 'http://www.heise.de/foto/rss/news-atom.xml'),
        ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
        ('Mobile', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
        ('Netz', 'http://www.heise.de/netze/rss/netze-atom.xml'),
        ('Open', 'http://www.heise.de/open/news/news-atom.xml'),
        ('Resale', 'http://www.heise.de/resale/rss/resale.rdf'),
        ('Security', 'http://www.heise.de/security/news/news-atom.xml'),
        ("c't", 'http://www.heise.de/ct/rss/artikel-atom.xml'),
        ('iX', 'http://www.heise.de/ix/news/news.rdf'),
        ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
        ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
        ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
        ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
        ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
        ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
    ]

    def print_version(self, url):
        return url + '?view=print'
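
The `print_version` hook above appends `?view=print` directly, which only works when the article URL has no query string or fragment of its own. Below is a minimal sketch of a more defensive variant, assuming Python 3's `urllib.parse` (as shipped with current calibre). The helper name `to_print_url` and the example URL are hypothetical and not part of the posted recipe.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_print_url(url):
    """Return url with view=print set, preserving other query parameters."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous 'view' value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != 'view']
    query.append(('view', 'print'))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))


# Hypothetical article URL, used only for illustration.
print(to_print_url('http://www.heise.de/example.html'))
```

Inside the recipe, `print_version` could then simply return `to_print_url(url)`.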
06-06-2011, 01:50 AM   #2
miwie
Connoisseur
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
Please consider adding the following:
Code:
    masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
    publisher   = 'Heise Zeitschriften Verlag GmbH & Co. KG'
    ...
    conversion_options = {'publisher': publisher,
    ...
                         }
11-18-2011, 05:21 AM   #3
schuster
Update - cleaner content

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe(BasicNewsRecipe):

    title = 'Heise-online'
    description = 'News vom Heise-Verlag'
    __author__ = 'schuster'
    masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co. KG'
    use_embedded_content = False
    language = 'de'
    oldest_article = 2  # days
    max_articles_per_feed = 35
    rescale_images = True
    remove_empty_feeds = True
    timeout = 5  # seconds
    no_stylesheets = True

    # Cut each page off after the <p class="editor"> element, then strip
    # navigation, social-bookmark, forum and ad containers.
    remove_tags_after = dict(name='p', attrs={'class': 'editor'})
    remove_tags = [
        dict(id='navi_top_container'),
        dict(id='navi_bottom'),
        dict(id='mitte_rechts'),
        dict(id='navigation'),
        dict(id='subnavi'),
        dict(id='social_bookmarks'),
        dict(id='permalink'),
        dict(id='content_foren'),
        dict(id='seiten_navi'),
        dict(id='adbottom'),
        dict(id='sitemap'),
        dict(name='div', attrs={'id': 'sitemap'}),
        dict(name='ul', attrs={'class': 'erste_zeile'}),
        dict(name='ul', attrs={'class': 'zweite_zeile'}),
        dict(name='div', attrs={'class': 'navi_top_container'}),
    ]

    # (title, feed URL) pairs; titles are shown as section names in the e-book.
    feeds = [
        ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
        ('Auto', 'http://www.heise.de/autos/rss/news.rdf'),
        ('Foto', 'http://www.heise.de/foto/rss/news-atom.xml'),
        ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
        ('Mobile', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
        ('Netz', 'http://www.heise.de/netze/rss/netze-atom.xml'),
        ('Open', 'http://www.heise.de/open/news/news-atom.xml'),
        ('Resale', 'http://www.heise.de/resale/rss/resale.rdf'),
        ('Security', 'http://www.heise.de/security/news/news-atom.xml'),
        ("c't", 'http://www.heise.de/ct/rss/artikel-atom.xml'),
        ('iX', 'http://www.heise.de/ix/news/news.rdf'),
        ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
        ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
        ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
        ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
        ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
        ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
    ]

    def print_version(self, url):
        return url + '?view=print'
12-06-2012, 01:43 PM   #4
schuster
Update - cleaner content from print version

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe(BasicNewsRecipe):

    title = 'Heise-online'
    description = 'News vom Heise-Verlag'
    __author__ = 'schuster'
    masthead_url = 'http://www.heise.de/icons/ho/heise_online_logo.gif'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co. KG'
    use_embedded_content = False
    language = 'de'
    oldest_article = 2  # days
    max_articles_per_feed = 35
    rescale_images = True
    remove_empty_feeds = True
    timeout = 5  # seconds
    no_stylesheets = True

    # Keep only the article containers of the print version.
    keep_only_tags = [
        dict(name='div', attrs={'id': 'mitte_news'}),
        dict(name='div', attrs={'id': 'news'}),
        dict(name='div', attrs={'id': 'print-blog'}),
        dict(name='h1', attrs={'class': 'clear'}),
        dict(name='div', attrs={'class': 'meldung_wrapper'}),
        dict(name='div', attrs={'id': 'artikel'}),
    ]

    remove_tags = [
        dict(id='navi_top_container'),
        dict(name='p', attrs={'class': 'size80'}),
    ]

    # (title, feed URL) pairs; titles are shown as section names in the e-book.
    feeds = [
        ('Newsticker', 'http://www.heise.de/newsticker/heise.rdf'),
        ('Developer', 'http://www.heise.de/developer/rss/news-atom.xml'),
        ('Foto', 'http://www.heise.de/foto/rss/news-atom.xml'),
        ('Mac&i', 'http://www.heise.de/mac-and-i/news.rdf'),
        ('Mobile', 'http://www.heise.de/mobil/newsticker/heise-atom.xml'),
        ('Netz', 'http://www.heise.de/netze/rss/netze-atom.xml'),
        ('Open', 'http://www.heise.de/open/news/news-atom.xml'),
        ('Resale', 'http://www.heise.de/resale/rss/resale.rdf'),
        ('Security', 'http://www.heise.de/security/news/news-atom.xml'),
        ("c't", 'http://www.heise.de/ct/rss/artikel-atom.xml'),
        ('Hardware Hacks', 'http://www.heise.de/hardware-hacks/rss/hardware-hacks-atom.xml'),
        ('iX', 'http://www.heise.de/ix/news/news.rdf'),
        ('Mach-flott', 'http://www.heise.de/mach-flott/rss/mach-flott-atom.xml'),
        ('Blog: Babel-Bulletin', 'http://www.heise.de/developer/rss/babel-bulletin/blog.rdf'),
        ('Blog: Der Dotnet-Doktor', 'http://www.heise.de/developer/rss/dotnet-doktor/blog.rdf'),
        ('Blog: Bernds Management-Welt', 'http://www.heise.de/developer/rss/bernds-management-welt/blog.rdf'),
        ('Blog: IT conversation', 'http://www.heise.de/developer/rss/world-of-it/blog.rdf'),
        ('Blog: Kais bewegtes Web', 'http://www.heise.de/developer/rss/kais-bewegtes-web/blog.rdf'),
    ]

    def print_version(self, url):
        return url + '?view=print'
ahhhh, this is the right one now. -sorry-
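
Feed lists that grow across several recipe versions can pick up stray whitespace in titles or repeated URLs. The following is a small sketch of a sanity pass one might run over such a list before pasting it into a recipe. The helper name `tidy_feeds` is hypothetical, and the third entry in the sample list is a repeat added purely for illustration.

```python
def tidy_feeds(feeds):
    """Strip whitespace from titles and URLs and drop repeated URLs, keeping order."""
    seen = set()
    cleaned = []
    for title, url in feeds:
        url = url.strip()
        if url in seen:
            # Same feed listed twice: keep only the first occurrence.
            continue
        seen.add(url)
        cleaned.append((title.strip(), url))
    return cleaned


feeds = [
    ('Foto ', 'http://www.heise.de/foto/rss/news-atom.xml'),
    ('Netz ', 'http://www.heise.de/netze/rss/netze-atom.xml'),
    ('Foto', 'http://www.heise.de/foto/rss/news-atom.xml'),  # repeat, for illustration
]

for title, url in tidy_feeds(feeds):
    print(repr(title), url)
```

The cleaned list can be assigned to the `feeds` attribute of the recipe class unchanged, since it keeps the same `(title, url)` tuple shape.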

Last edited by schuster; 12-07-2012 at 01:31 AM. Reason: wrong recipe posted