Old 11-03-2010, 04:47 AM   #1
BlonG
Member
Posts: 15
Karma: 10
Join Date: Oct 2010
Location: Slovenia
Device: Kindle 3G
Help with print_url and/or split_url

Since I'm a newbie, I try to learn from the examples I find here. I created a recipe, but I get an "unexpected indent" error in the print_version part.

The task is (should be) simple: replace article URL with print version URL.
  • FROM: ****://www.rtvslo.si/svet/republikanci-z-vecino-v-predstavniskem-domu-senat-ostaja-demokratom/243020
  • TO: ****://www.rtvslo.si/index.php?c_mod=news&op=print&id=243020
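To make the intended transformation concrete, here is a minimal standalone sketch of it outside the recipe. It assumes the numeric article id is always the last path segment of the article URL; to_print_url is a hypothetical helper name used only for illustration, not part of calibre.

```python
def to_print_url(article_url):
    """Build the print-version URL from an article URL (hypothetical helper)."""
    # Assumption: the numeric article id is always the last path segment.
    article_id = article_url.rstrip('/').split('/')[-1]
    return 'http://www.rtvslo.si/index.php?c_mod=news&op=print&id=' + article_id


example = ('http://www.rtvslo.si/svet/republikanci-z-vecino-v-predstavniskem-'
           'domu-senat-ostaja-demokratom/243020')
print(to_print_url(example))
```

Taking the last segment rather than a fixed position keeps the sketch independent of how many path components precede the id.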

When I try to add/update the recipe, I get the error mentioned above on line 56 (the line with: return print_url).

Can someone please take a look and help me out?

Code:
__license__ = 'GPL v3'
__copyright__ = '2010, BlonG'
'''
www.rtvslo.si
'''
from calibre.web.feeds.news import BasicNewsRecipe
class MMCRTV(BasicNewsRecipe):
  title = u'MMC RTV'
  __author__ = u'BlonG'
# 10
  description = u"Prvi interaktivni multimedijski portal, MMC RTV Slovenija"
  oldest_article = 3
  max_articles_per_feed = 20
  encoding = 'cp1250'
  language = 'sl'
  no_stylesheets = True
  use_embedded_content = False

  cover_url = 'http://img.rtvslo.si/_static/images/rtvportal_logo.png'
# 20
  extra_css = '''
	h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
	h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
	p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
	body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
	'''

  html2lrf_options = ['--base-font-size', '10']

# 30
# keep_only_tags = [
# 	dict(name='div', attrs={'id':'contents'}),
#	dict(name='div', attrs={'class':'entry-content'}),
#	]

#  remove_tags = [
#	dict(name='div', attrs={'class':'fb_article_top'}),
#	dict(name='div', attrs={'class':'related'}),
#	dict(name='div', attrs={'class':'fb_article_foot'}),
# 40
#	dict(name='div', attrs={'class':'spreading'}),
#	dict(name='dl', attrs={'class':'ad'}),
# 	dict(name='p', attrs={'class':'report'}),
#	dict(name='div', attrs={'class':'hfeed comments'}),
#	dict(name='dl', attrs={'id':'entryPanel'}),
#	dict(name='dl', attrs={'class':'infopush ip_wide'}),
#	dict(name='div', attrs={'class':'sidebar'}),
#	dict(name='dl', attrs={'class':'bottom'}),
#	dict(name='div', attrs={'id':'footer'}),
# 50
#	]

    def print_version(self, url):
	split_url = url.split("/")
	print_url = 'http://www.rtvslo.si/index.php?c_mod=news&op=print&id=' +  split_url[1]
	return print_url

    feeds = [
	(u'Vse novice', u'http://www.rtvslo.si/feeds/00.xml')
	,(u'Okolje', u'http://www.rtvslo.si/feeds/12.xml')
	,(u'Znanost in tehnologija', u'http://www.rtvslo.si/feeds/09.xml')
	,(u'Zabava', u'http://www.rtvslo.si/feeds/06.xml')
	]