Quote:
Originally Posted by bhandarisaurabh
it is giving this error
ERROR: Invalid input: <p>Could not create recipe. Error:<br>unindent does not match any outer indentation level (recipe46.py, line 51)
Actually, I am using different feeds from the built-in recipe.
The RSS feeds page is
http://feeds.business-standard.com/
Alright, first, let's not piggyback on the built-in recipe but make our own version, since the feeds are different. With that said, I had to test the code to get it right, because the index on the split is 0-based and the very last element is blank. So if, say, the length of the split array is 8, the id is at index 6, not 7. So I just take the element at index len(split1) - 2.
Anyway, this code works.
Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, re
class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = 'Business Standard modified'
    language = 'en'
    __author__ = 'TonytheBookworm'
    description = 'Business Standard modified'
    publisher = 'Business Standard'
    category = ''
    oldest_article = 5
    max_articles_per_feed = 100
    no_stylesheets = True
    #extra_css = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt }'
    #masthead_url = 'http://gawand.org/wp-content/uploads/2010/06/ajc-logo.gif'
    #keep_only_tags = [
    #    dict(name='div', attrs={'class':['blogEntryHeader','blogEntryContent']})
    #    ,dict(attrs={'id':['cxArticleText','cxArticleBodyText']})
    #    ]

    feeds = [
        (u'Todays Newspaper', u'http://feeds.business-standard.com/rss/paper.xml'),
        (u'Banking & finance', u'http://feeds.business-standard.com/rss/1.xml'),
        (u'Companies & Industry', u'http://feeds.business-standard.com/rss/2.xml'),
        (u'Economy & Policy', u'http://feeds.business-standard.com/rss/3.xml'),
        (u'Opinion and analysis', u'http://feeds.business-standard.com/rss/5_0.xml'),
        (u'Life & Leisure', u'http://feeds.business-standard.com/rss/6_0.xml'),
        (u'Markets & Investing', u'http://feeds.business-standard.com/rss/12.xml'),
        (u'Management & Mktg', u'http://feeds.business-standard.com/rss/7_0.xml'),
        (u'Tech World', u'http://feeds.business-standard.com/rss/8_0.xml'),
    ]

    def print_version(self, url):
        # Build the print-page URL from the article id found in the feed URL.
        split1 = url.split("/")
        print('ORG URL IS: ' + url)
        # Offset by 2: the list is 0-based and the last element is blank
        id_index = len(split1) - 2
        idnum = split1[id_index]  # the actual article id
        print('the idnum is: ' + idnum)
        print_url = 'http://www.business-standard.com/india/printpage.php?autono=' + idnum + '&tp='
        print('PRINT URL IS: ' + print_url)
        return print_url
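
If you want to see why the offset is 2, here is a small standalone sketch you can run outside calibre. The sample URL is made up for illustration (I am assuming the article links end with a trailing slash after the id, as described above), and the helper names article_id_from_url and build_print_url are hypothetical, not part of calibre.

```python
def article_id_from_url(url):
    """Return the second-to-last piece of a '/'-separated URL."""
    parts = url.split("/")
    # A trailing '/' leaves an empty string as the last element,
    # so the id sits at index len(parts) - 2 (the same as parts[-2]).
    return parts[len(parts) - 2]


def build_print_url(url):
    """Build the print-page URL the same way the recipe does."""
    return ('http://www.business-standard.com/india/printpage.php?autono='
            + article_id_from_url(url) + '&tp=')


# Hypothetical article URL, made up for illustration only.
sample = 'http://www.business-standard.com/india/news/some-headline/123456/'
parts = sample.split("/")
print(parts)       # the last element is the empty string
print(len(parts))  # 8 pieces, so the id is at index 6
print(build_print_url(sample))
```

Note that this only holds while the link ends with a slash. If a feed ever hands back a URL without the trailing slash, the id would be the last element instead, so check the 'ORG URL IS' output if the print pages come up empty.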