MobileRead Forums - View Single Post - Custom recipes (archive, read-only)

keckx · 03-23-2009, 05:14 AM

Hi,
I just made my first recipe for www.nzz.ch.
It does everything I want, but unfortunately it is very slow: it takes 56 minutes to produce a 0.3 MB ebook.

I started from the BBC recipe, but I don't see why the NZZ version should be so slow.

Here's the recipe:

Code:

#!/usr/bin/env  python
'''
nzz.ch
'''

from calibre.web.feeds.news import BasicNewsRecipe

class NewNzz(BasicNewsRecipe):
    title          = u'Neue Zuercher Zeitung'
    __author__     = 'NZZ'
    description    = 'Neue Zuercher Zeitung'
    no_stylesheets = True
    language = _('German')
    keep_only_tags = [dict(name='div', attrs={'class':'article'})]
    remove_tags_before = dict(id='article')
    remove_tags_after  = dict(id='article')
    remove_tags     = [dict(attrs={'class':['more', 'nowrap', 'footer', 'teaser', 'articleTools', 'post-tools', 'side_tool', 'nextArticleLink clearfix']}),
                       dict(id=['formSendArticle', 'footer', 'toolsRight', 'articleInline', 'navigation', 'archive', 'side_search', 'blog_sidebar', 'side_tool', 'side_index']),
                       dict(name=['script', 'noscript', 'style'])]


    feeds          = [
                      ('Top Themen', 'http://www.nzz.ch/nachrichten/startseite?rss=true'),
                      ('International', 'http://www.nzz.ch/nachrichten/international?rss=true'),
                      ('Schweiz', 'http://www.nzz.ch/nachrichten/schweiz?rss=true'),
                      ('Wirtschaft', 'http://www.nzz.ch/nachrichten/wirtschaft/aktuell?rss=true'),
                      ('Zuerich', 'http://www.nzz.ch/nachrichten/zuerich?rss=true'),
                      ('Sport', 'http://www.nzz.ch/nachrichten/sport?rss=true'),
                      ('Panorama', 'http://www.nzz.ch/nachrichten/panorama?rss=true'),
                    ]

    def print_version(self, url):
        # Fetch the print view of each article instead of the regular page
        return url + '?printview=true'
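
A side note on print_version: it appends '?printview=true' verbatim, which would give a malformed URL if an article link already carried a query string (the feed URLs themselves use ?rss=true). I have not checked whether nzz.ch article links ever do, so this is only a minimal sketch under that assumption. The helper name add_print_view is made up for illustration, and it uses Python 3's urllib.parse rather than anything from calibre:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_print_view(url):
    """Return url with printview=true merged into its query string.

    Hypothetical helper, not part of calibre: it keeps any existing
    query parameters and fragment instead of blindly appending '?...'.
    """
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier printview value
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != 'printview']
    query.append(('printview', 'true'))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))
```

Inside the recipe, print_version could then simply return add_print_view(url). For links without a query string the result is the same as the plain string concatenation.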

Any ideas?
Best regards
keckx