MobileRead Forums - View Single Post

unkn0wn · 02-13-2022, 02:11 AM

https://github.com/kovidgoyal/calibr...s/hindu.recipe

On line 80, the recipe currently has:

div = soup.find('section', attrs={'id': 'section_'})

Change it to:

div = soup.find('section', attrs={'id': 'section_1'})

With the id changed to section_1, it fetches article links from each section.
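If the id suffix changes again, a pattern match might be less brittle than a hard-coded string. This is only a sketch: SECTION_ID and find_section are hypothetical names, not part of the recipe, and it assumes the site keeps using ids of the form section_ followed by optional digits.

```python
import re

# Hypothetical pattern: 'section_' followed by optional digits,
# so it accepts both the old 'section_' and the new 'section_1'.
SECTION_ID = re.compile(r'^section_\d*$')


def find_section(soup):
    # BeautifulSoup accepts a compiled regex as an attribute value filter,
    # so this returns the first <section> whose id matches the pattern.
    return soup.find('section', attrs={'id': SECTION_ID})
```

In the recipe, line 80 would then read div = find_section(soup). Note that this still returns only the first matching section.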

It loads the whole article text, but it doesn't load images like it did before. It looks like they changed that part too.

This is the current code that fetches images (line 49):

Code:

def preprocess_html(self, soup):
    img = soup.find('img', attrs={'class': 'lead-img'})
    try:
        for i, source in enumerate(tuple(img.parent.findAll('source', srcset=True))):
            if i == 0:
                img['src'] = source['srcset'].split()[0]
            source.extract()
    except Exception:
        pass

And this is the markup that contains the image:

Code:

<img class="lead-img" src="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
data-src-template="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
data-original="https://www.thehindu.com/todays-paper/1cezy4/article65041871.ece/alternates/FREE_660/First-ever-wate%2BGSE9G8VFI.3.jpg.jpg" 
alt="Winged visitors making a splash at a lake in city.NAGARA GOPAL" 
title="Winged visitors making a splash at a lake in city.NAGARA GOPAL" 
data-device-variant="FREE~FREE~FREE~FREE" 
 width="100%" height="100%">

I tried replacing srcset with data-original, but it isn't working. It's a bit complex for me.
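One possible reason the swap has no effect: the existing code searches for source tags next to the image, while in the markup above data-original is an attribute of the img tag itself, so the loop finds nothing to iterate over. Below is a rough, untested-against-the-site sketch that reads the URL straight from the img tag. CANDIDATE_ATTRS and pick_image_url are hypothetical names, the attribute order is a guess, and it assumes the HTML calibre downloads really carries these attributes.

```python
# Attribute names seen on the <img> tag in the post; the order is an assumption.
CANDIDATE_ATTRS = ('data-original', 'data-src-template', 'src')


def pick_image_url(attrs):
    """Return the first http(s) URL among the candidate attributes, or None.

    `attrs` can be a plain dict or a BeautifulSoup tag; both support .get().
    """
    for name in CANDIDATE_ATTRS:
        value = (attrs.get(name) or '').strip()
        if value.startswith(('http://', 'https://')):
            return value
    return None


def preprocess_html(self, soup):
    # In the recipe this would be a method of the recipe class.
    for img in soup.findAll('img', attrs={'class': 'lead-img'}):
        url = pick_image_url(img)
        if url:
            img['src'] = url
    # calibre expects preprocess_html to return the soup.
    return soup
```

The helper can be checked on its own with a plain dict, for example pick_image_url({'data-original': 'https://example.com/a.jpg'}), before wiring it into the recipe.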