Here are the updated feeds for India Today:
Quote:
feeds = [
    ('Editor\'s Note', 'https://www.indiatoday.in/rss/1206516'),
    ('Cover Story', 'https://www.indiatoday.in/rss/1206509'),
    ('The Big Story', 'https://www.indiatoday.in/rss/1206614'),
    ('UP Front', 'https://www.indiatoday.in/rss/1206609'),
    ('Leisure', 'https://www.indiatoday.in/rss/1206551'),
    ('Nation', 'https://www.indiatoday.in/rss/1206514'),
    ('Health', 'https://www.indiatoday.in/rss/1206515'),
    ('Defence', 'https://www.indiatoday.in/rss/1206517'),
    ('Guest Column', 'https://www.indiatoday.in/rss/1206612'),
    # ('States', 'https://www.indiatoday.in/rss/1206500'),
    # ('Economy', 'https://www.indiatoday.in/rss/1206513'),
    # ('Special Report', 'https://www.indiatoday.in/rss/1206616'),
    # ('Investigation', 'https://www.indiatoday.in/rss/1206617'),
    # ('Diplomacy', 'https://www.indiatoday.in/rss/1206512'),
    # ('Sports', 'https://www.indiatoday.in/rss/1206518'),
]
I think you will be able to update this recipe easily. I tried, but I don't know the language.
For example, take this link.
Apart from the h1 and h2 headings, the whole article is inside <div class="hide" itemprop="articleBody">, which can be separated easily.
The images are located within it, in <div class="itgimage">.
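
To illustrate the page structure described above, here is a rough sketch using only Python's standard library. It is not the recipe itself: the class name ArticleBodyParser, the function extract_article, and the sample HTML are made up for illustration, and the real pages may contain markup this does not handle. In an actual calibre recipe the same selection would more likely be expressed through the recipe's keep_only_tags setting.

```python
from html.parser import HTMLParser


class ArticleBodyParser(HTMLParser):
    """Collect text inside the articleBody div and image URLs inside itgimage divs."""

    def __init__(self):
        super().__init__()
        # One entry per currently open <div>: 'body', 'image', or None.
        self.div_stack = []
        self.text_parts = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            classes = (attrs.get('class') or '').split()
            if attrs.get('itemprop') == 'articleBody':
                self.div_stack.append('body')
            elif 'itgimage' in classes:
                self.div_stack.append('image')
            else:
                self.div_stack.append(None)
        elif tag == 'img' and 'body' in self.div_stack and 'image' in self.div_stack:
            # Only keep images that sit in an itgimage div within the article body.
            src = attrs.get('src')
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == 'div' and self.div_stack:
            self.div_stack.pop()

    def handle_data(self, data):
        # Keep non-blank text only while we are somewhere inside the article body.
        if 'body' in self.div_stack and data.strip():
            self.text_parts.append(data.strip())


def extract_article(html):
    """Return (article_text, image_urls) for one article page."""
    parser = ArticleBodyParser()
    parser.feed(html)
    parser.close()
    return ' '.join(parser.text_parts), parser.images


# Hypothetical sample page, shaped like the structure described in this post.
SAMPLE = '''
<h1>Headline</h1>
<div class="hide" itemprop="articleBody">
  <p>First paragraph.</p>
  <div class="itgimage"><img src="https://example.com/photo.jpg"></div>
  <p>Second paragraph.</p>
</div>
<div class="footer">Not part of the article</div>
'''

text, images = extract_article(SAMPLE)
print(text)
print(images)
```

The parser tracks only div nesting, so the headline and the footer div are skipped while the paragraphs and the image inside the article body are kept.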