You're right, I was looking at the CSS, but it seems the size comes from the HTML.
Any idea on how to parse that from the source?
The tag in the page source looks like this:
Code:
<img height="0" width="414" src="">
And I would like to remove both height and width from that tag, whatever size they specify (it can vary), leaving only
I guess I could try with soup, something like
Code:
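def postprocess_html(self, soup, first_fetch):
    # width and height are attributes of <img>, not tags of their own,
    # so find_all('width') would never match anything.
    for img in soup.find_all('img'):
        for attr in ('width', 'height'):
            # Drop the attribute whatever value it holds.
            if img.has_attr(attr):
                del img[attr]
    return soup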
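To try the idea outside the recipe, a small standalone sketch like the one below might help. It assumes the `bs4` package (Beautiful Soup 4) is installed and uses the built-in `html.parser`; `strip_img_size` is just a name made up for this test, not part of any recipe API. It treats `width` and `height` as attributes of each `img` tag and deletes them regardless of their values.

```python
from bs4 import BeautifulSoup


def strip_img_size(html):
    """Return html with width/height removed from every <img> tag.

    Hypothetical helper for testing; assumes bs4 is installed.
    """
    soup = BeautifulSoup(html, 'html.parser')
    for img in soup.find_all('img'):
        for attr in ('width', 'height'):
            # Delete the attribute only if the tag actually carries it.
            if img.has_attr(attr):
                del img[attr]
    return str(soup)


# The sample tag from the page source; src is left untouched.
print(strip_img_size('<img height="0" width="414" src="">'))
```

If the printed tag no longer shows `width` or `height`, the same loop should be safe to use on the soup inside the recipe.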