Quote:
Originally Posted by TonytheBookworm
If you're actually trying to modify the built-in recipe, I do not see why. I tested it on my end, and after running it I do not see any articles that were not in the print version. I also ran a test with print statements included, and I do not see anywhere that the original URL is being changed in the way you described. It appears to follow the flow that the original author of the recipe expected. In other words, it's kind of hard to fix something that isn't broken. <shrug>
As far as the indents go, you have to make sure they are spaced out correctly.
****Notice that the return statement is directly under the print_url statement.
|
It is giving this error:
ERROR: Invalid input: <p>Could not create recipe. Error:<br>unindent does not match any outer indentation level (recipe46.py, line 51)
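This kind of error can usually be reproduced outside calibre by compiling the recipe source with plain Python, which reports the first line whose indentation does not line up. The following is only a rough sketch: find_indentation_error is a made-up helper name, not part of calibre, and the sample source is invented for illustration.

```python
def find_indentation_error(source, filename='<recipe>'):
    """Return (line_number, message) for the first indentation error, or None.

    Hypothetical helper: it only compiles the text with Python's built-in
    compile() and never runs the recipe.
    """
    try:
        compile(source, filename, 'exec')
    except IndentationError as err:
        return err.lineno, err.msg
    return None


# Invented sample: the 'return' line dedents to a level (4 spaces) that
# matches neither the 'def' line (0) nor the line above it (8).
bad_source = "def f():\n        x = 1\n    return x\n"
good_source = "def f():\n    x = 1\n    return x\n"

print(find_indentation_error(bad_source))
print(find_indentation_error(good_source))
```

A common cause is mixing tabs and spaces, or pasting a line that is indented by a different number of spaces than its neighbours.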
Code:
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2009, Darko Miletic <darko.miletic at gmail.com>'
'''
www.business-standard.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe

class BusinessStandard(BasicNewsRecipe):
    title = 'Business Standard'
    __author__ = 'Darko Miletic'
    description = "India's most respected business daily"
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'cp1252'
    publisher = 'Business Standard Limited'
    category = 'news, business, money, india, world'
    language = 'en_IN'

    conversion_options = {
         'comments' : description
        ,'tags' : category
        ,'language' : language
        ,'publisher' : publisher
        ,'linearize_tables': True
    }

    remove_attributes=['style']
    remove_tags = [dict(name=['object','link','script','iframe'])]

    feeds = [
         (u'Todays Newspaper' , u'http://feeds.business-standard.com/rss/paper.xml' )
        ,(u'Banking & finance' , u'http://feeds.business-standard.com/rss/1.xml' )
        ,(u'Companies & Industry', u'http://feeds.business-standard.com/rss/2.xml')
        ,(u'Economy & Policy' , u'http://feeds.business-standard.com/rss/3.xml' )
        ,(u'Opinion and analysis', u'http://feeds.business-standard.com/rss/5_0.xml')
        ,(u'Life & Leisure' , u'http://feeds.business-standard.com/rss/6_0.xml' )
        ,(u'Markets & Investing' , u'http://feeds.business-standard.com/rss/12.xml' )
        ,(u'Management & Mktg' , u'http://feeds.business-standard.com/rss/7_0.xml' )
        ,(u'Tech World',u'http://feeds.business-standard.com/rss/8_0.xml')
    ]

    def print_version(self, url):
        # Debug output: the article URL as it is passed in
        print('ORG URL IS: %s' % url)
        split1 = url.split("/")
        print('THE SPLIT IS: %s' % split1)
        # The id is expected to be the last element of the split, so index
        # from the end: split1[len(split1)] would be one past the last element.
        # The quotes must be plain ASCII quotes, not curly ones.
        print_url = 'http://www.business-standard.com/india/printpage.php?autono=' + split1[-1] + '&tp='
        return print_url

    def get_article_url(self, article):
        # Use the feed entry's guid as the article URL
        return article.get('guid', None)
Actually, I am using different feeds from the ones in the built-in recipe.
The RSS feeds page is:
http://feeds.business-standard.com/
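For testing the URL rewriting on its own, without running a full download, the logic of print_version can be pulled out into a plain function. This is only a sketch under assumptions: build_print_url is a hypothetical name, the sample article URL is invented, and it assumes (as the recipe does) that the article id is the last path segment of the guid.

```python
PRINT_PREFIX = 'http://www.business-standard.com/india/printpage.php?autono='


def build_print_url(article_url):
    """Build the print-page URL from an article URL (hypothetical helper)."""
    # Strip a trailing slash first so the last segment is never empty.
    segments = article_url.rstrip('/').split('/')
    article_id = segments[-1]  # assumed to be the article id
    return PRINT_PREFIX + article_id + '&tp='


# Invented sample URL, only to show the shape of the result.
sample = 'http://www.business-standard.com/india/news/some-story/123456/'
print(build_print_url(sample))
# prints http://www.business-standard.com/india/printpage.php?autono=123456&tp=
```

If the guid in these feeds does not end with the article id, the segment index would need to change accordingly.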