10-12-2014, 11:30 PM   #2
rajil.s
Device: Kindle Touch
I took a first stab at the recipe. A few problems with it:
1. Some articles are spread over multiple pages. How do I fetch the text from all the pages and merge it into a single article?
2. The RSS section shows the same fixed text for each feed, e.g. "Amarujala News : A Hindi News Website covers Breaking India news samachar in hindi, News Headlines in hindi from every State of India, news on business, sports, bollywood, political and more only at Amarujala.com". How do I remove it?

Any pointers would be appreciated.

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AmarUjala(BasicNewsRecipe):
    title = u'Amar Ujala'
    language = 'hi_IN'
    publication_type = 'newspaper'
    masthead_url = 'http://epaper.amarujala.com/images/header_logo.gif'

    oldest_article = 2.0  # days
    use_embedded_content = False  # download the linked article, not the feed text
    remove_empty_feeds = True
    no_stylesheets = True
    auto_cleanup = True  # let calibre's heuristics pick out the article body

    feeds = [
        (u'National News', u'http://www.amarujala.com/rss/national-news.xml'),
        (u'International news', u'http://www.amarujala.com/rss/international-news.xml'),
        (u'Sports news', u'http://www.amarujala.com/rss/sports-news.xml'),
        (u'Business News', u'http://www.amarujala.com/rss/business-news.xml'),
        (u'Technology News', u'http://www.amarujala.com/rss/technology-news.xml'),
    ]
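
To make the two problems concrete, here is a rough plain-Python sketch of the behaviour I am after, independent of calibre. The names merge_pages, strip_fixed_text and the fetch_page callback are made up for illustration and are not calibre APIs. The page data is a fake in-memory stand-in, not the real site. What I do not know is where this kind of logic should hook into the recipe.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: given a URL, return (page_text, next_page_url or None).
FetchPage = Callable[[str], Tuple[str, Optional[str]]]


def merge_pages(first_url: str, fetch_page: FetchPage, max_pages: int = 10) -> str:
    """Follow 'next page' links starting at first_url and join the page texts."""
    parts: List[str] = []
    seen = set()
    url: Optional[str] = first_url
    # Stop on a missing next link, a link we already visited, or the page cap.
    while url and url not in seen and len(parts) < max_pages:
        seen.add(url)
        text, url = fetch_page(url)
        parts.append(text.strip())
    return "\n\n".join(parts)


def strip_fixed_text(summary: str, fixed_text: str) -> str:
    """Remove the repeated site blurb from a feed or article summary."""
    return summary.replace(fixed_text, "").strip()


# Fake three-page article, used only to exercise the helpers.
PAGES = {
    "a/1": ("Part one.", "a/2"),
    "a/2": ("Part two.", "a/3"),
    "a/3": ("Part three.", None),
}
merged = merge_pages("a/1", lambda u: PAGES[u])

BLURB = "Amarujala News : A Hindi News Website"
cleaned = strip_fixed_text(BLURB + " Real summary.", BLURB)
```

The seen set and the max_pages cap are only there so that a "next" link pointing back to an earlier page cannot loop forever.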