02-13-2022, 01:11 AM   #1
unkn0wn
Guru
Posts: 625
Karma: 85520
Join Date: May 2021
Device: kindle
The Hindu stopped working

https://github.com/kovidgoyal/calibr...s/hindu.recipe

In line 80, the recipe has:

div = soup.find('section', attrs={'id': 'section_'})

Change it to:

div = soup.find('section', attrs={'id': 'section_1'})

After I changed the id to section_1, the recipe fetches article links from each section again.
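Since the id suffix has already changed once, a more tolerant lookup might save the next edit. This is only a minimal sketch, assuming the index page keeps using ids of the form section_<number>; I haven't checked that against the live site. The sample HTML and the helper name find_numbered_sections are made up for illustration and are not part of the recipe.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for the index page; the real markup may differ.
SAMPLE = """
<section id="section_1"><a href="/a1.ece">One</a></section>
<section id="section_2"><a href="/a2.ece">Two</a></section>
<section id="footer"><a href="/about">About</a></section>
"""


def find_numbered_sections(soup):
    """Return every <section> whose id looks like section_<number>."""
    return soup.find_all('section', attrs={'id': re.compile(r'^section_\d+$')})


soup = BeautifulSoup(SAMPLE, 'html.parser')
sections = find_numbered_sections(soup)

# Ids of the matched sections and the article links inside them.
section_ids = [s['id'] for s in sections]
links = [a['href'] for s in sections for a in s.find_all('a', href=True)]
```

A regular expression as the attribute value makes BeautifulSoup match any id that fits the pattern, so the lookup doesn't depend on one exact suffix.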

It loads the whole article text, but it no longer loads images like before. It looks like they changed that part of the page too.

This is the current code that fetches images (line 49):

Code:
def preprocess_html(self, soup):
        img = soup.find('img', attrs={'class': 'lead-img'})
        try:
            for i, source in enumerate(tuple(img.parent.findAll('source', srcset=True))):
                if i == 0:
                    img['src'] = source['srcset'].split()[0]
                source.extract()
        except Exception:
            pass
And this is the markup where the image is now:

Code:
<img class="lead-img" src="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
data-src-template="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
data-original="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
alt="Winged visitors making a splash at a lake in city.NAGARA GOPAL" 
title="Winged visitors making a splash at a lake in city.NAGARA GOPAL" 
data-device-variant="FREE~FREE~FREE~FREE" 
 width="100%" height="100%">
I replaced srcset with data-original, but it isn't working. It's a bit complex for me.
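One possible reason the swap doesn't help: in the markup above, data-original sits on the img tag itself, not on a source child, so a findAll('source', ...) lookup would likely find nothing to copy from. Here is a minimal sketch of reading the attribute straight off the img instead. It assumes the page the recipe downloads carries the real URL in data-original, which I haven't verified. The function name use_lazy_image_url and the sample HTML are hypothetical.

```python
from bs4 import BeautifulSoup


def use_lazy_image_url(soup):
    """Copy data-original onto src for every lead image that has it."""
    for img in soup.find_all('img', attrs={'class': 'lead-img'}):
        original = img.get('data-original')  # None if the attribute is absent
        if original:
            img['src'] = original
    return soup


# Hypothetical sample: src holds a placeholder, data-original the real URL.
html = (
    '<div>'
    '<img class="lead-img" src="placeholder.gif" '
    'data-original="https://example.com/photo.jpg">'
    '<img class="lead-img" src="untouched.jpg">'
    '</div>'
)

fixed = use_lazy_image_url(BeautifulSoup(html, 'html.parser'))
sources = [img['src'] for img in fixed.find_all('img')]
```

In a recipe, the same loop would go inside preprocess_html, which should return the soup when it is done.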

Last edited by unkn0wn; 02-13-2022 at 01:28 AM.