India Today update
https://github.com/kovidgoyal/calibr...a_today.recipe
Code:
extra_css = '[itemprop^="description"] {font-size: small; font-style: italic;}'

def get_cover_url(self):
    soup = self.index_to_soup('https://www.magzter.com/IN/India-Today-Group/India-Today/News/')
    for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
        return citem['content']
We can't get this cover from the default website, so the recipe takes it from the Magzter page instead.
THE WEEK India
https://github.com/kovidgoyal/calibr...he_week.recipe
Cover URL and other updates.
Code:
def get_cover_url(self):
    soup = self.index_to_soup('https://www.magzter.com/IN/Malayala_Manorama/THE_WEEK/Business/')
    for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
        return citem['content']
The cover image fetched by the present recipe is of very low quality.
Remove everything from line 36 to line 57 (the end of the recipe). The present recipe won't load the images within the article text (the images are in the src attribute).
Then add the following:
Code:
keep_only_tags = [
    dict(name='h1'),
    dict(name='div', attrs={'class':['article-title','article-image','articlecontentbody section']}),
]
remove_tags = [
    dict(name='div', attrs={'class':'highlights section'}),
]
Financial Express
Cover URL
Code:
def get_cover_url(self):
    soup = self.index_to_soup('https://www.magzter.com/IN/The-Indian-Express-Ltd./Financial-Express-Mumbai/Business/')
    for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
        return citem['content']
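
All three get_cover_url snippets use the same filter: take the first meta tag whose content attribute ends with 'view/3.jpg'. For anyone who wants to see what that filter picks up without running calibre, here is a rough standalone sketch using only the Python standard library. The names MetaCoverFinder, find_cover_url and SAMPLE_HTML are made up for illustration, and the sample markup and example.com URL are assumptions, not the real Magzter page.

```python
from html.parser import HTMLParser


class MetaCoverFinder(HTMLParser):
    """Remember the content of the first <meta> tag whose content ends with a suffix."""

    def __init__(self, suffix='view/3.jpg'):
        super().__init__()
        self.suffix = suffix
        self.cover_url = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop once a match has been found.
        if tag != 'meta' or self.cover_url is not None:
            return
        content = dict(attrs).get('content')
        # Same test as the lambda in the recipes: non-empty and ends with the suffix.
        if content and content.endswith(self.suffix):
            self.cover_url = content


def find_cover_url(html, suffix='view/3.jpg'):
    """Return the first matching meta content, or None if nothing matches."""
    finder = MetaCoverFinder(suffix)
    finder.feed(html)
    return finder.cover_url


# Hypothetical markup, only for illustration.
SAMPLE_HTML = '''
<html><head>
<meta name="description" content="A weekly magazine">
<meta property="og:image" content="https://example.com/covers/123/view/3.jpg">
</head><body></body></html>
'''

if __name__ == '__main__':
    print(find_cover_url(SAMPLE_HTML))
```

Like the recipe code, this gives back None when no meta tag matches, so if the page layout changes the cover is simply not found rather than raising an error.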