11-08-2010, 03:04 AM   #16
BlonG
Location: Slovenia
Device: Kindle 3G
Here is the current recipe:

Code:
__license__ = 'GPL v3'
__copyright__ = '2010, BlonG'
'''
www.rtvslo.si
'''
from calibre.web.feeds.news import BasicNewsRecipe
from BeautifulSoup import BeautifulSoup

class MMCRTV(BasicNewsRecipe):
    title = u'MMC RTV Slovenija'
    __author__ = u'BlonG'
    description = u"Prvi interaktivni multimedijski portal, MMC RTV Slovenija"
    oldest_article = 3
    max_articles_per_feed = 20
    language = 'sl'
    no_stylesheets = True
    use_embedded_content = False

    cover_url = 'https://sites.google.com/site/javno2010/home/rtv_slo_cover.jpg'

    extra_css = '''
            h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
            h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
            p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
            body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
            '''

    html2lrf_options = ['--base-font-size', '10']

    def print_version(self, url):
            # The last segment of the article URL is used as the article id
            # for the print view.
            split_url = url.split("/")
            print_url = 'http://www.rtvslo.si/index.php?c_mod=news&op=print&id=' + split_url[-1]
            return print_url

    keep_only_tags = [
            dict(name='div', attrs={'class':'title'}),
            dict(name='div', attrs={'id':'newsbody'}),
            dict(name='div', attrs={'id':'newsblocks'}),
            ]
#    remove_tags=[
#            dict(name='div', attrs={'id':'newsblocks'}),
#            ]

    feeds = [
            (u'Slovenija', u'http://www.rtvslo.si/feeds/01.xml'),
            (u'Svet', u'http://www.rtvslo.si/feeds/02.xml'),
            (u'Evropska unija', u'http://www.rtvslo.si/feeds/16.xml'),
            (u'Gospodarstvo', u'http://www.rtvslo.si/feeds/04.xml'),
            (u'Črna kronika', u'http://www.rtvslo.si/feeds/08.xml'),
            (u'Okolje', u'http://www.rtvslo.si/feeds/12.xml'),
            (u'Znanost in tehnologija', u'http://www.rtvslo.si/feeds/09.xml'),
            (u'Zabava', u'http://www.rtvslo.si/feeds/06.xml'),
            (u'Ture avanture', u'http://www.rtvslo.si/feeds/28.xml'),
            ]

#    def preprocess_html(self, soup):
#            newsblocks = soup.find('div', attrs = {'id':'newsblocks'})
#            soup.find('div', attrs = {'id':'newsbody'}).insert(-1, newsblocks)
#            return soup
It works without the last part ("def preprocess...").
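
For anyone who wants to check the print_version step outside calibre, here is a minimal standalone sketch of the same URL rewrite. The example article URL is made up for illustration, and stripping a trailing slash is my own addition, not something the recipe above does; it assumes, as the recipe does, that the last path segment of the feed link is the article id.

```python
# Standalone sketch of the recipe's print_version logic (no calibre needed).
PRINT_BASE = 'http://www.rtvslo.si/index.php?c_mod=news&op=print&id='


def print_version(url):
    """Build the print-view URL from an article URL ending in the article id."""
    # Assumption: drop a trailing slash so the last segment is never empty.
    article_id = url.rstrip('/').split('/')[-1]
    return PRINT_BASE + article_id


# Hypothetical article URL, for illustration only.
example = 'http://www.rtvslo.si/slovenija/primer-novice/123456'
print(print_version(example))
```

If a feed ever hands back links in a different shape (for example with a query string after the id), this would need adjusting, so treat it as a starting point rather than a finished helper.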