Thank you. I am also a bit confused about the syntax.
In your example above you used:
keep_only_tags = dict(attrs={'class':'asset story clearfix'})
I was thinking it would be something like
keep_only_tags = dict(name='article', attrs={'class':'asset story clearfix'})
As for remove_tags, I sometimes see:
remove_tags = [dict(name='div', attrs={'class':'advert'})]
When do you need to have name='xx' in the dict, and when can you leave it out?
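
From what I can tell in the calibre recipe documentation, each of these dicts is used as search criteria for Beautiful Soup, and all keys are optional. If that is right, leaving out name would simply match any tag that has the given attributes. Here is a minimal sketch of how I imagine the matching works. The helper `matches` is hypothetical and only imitates the behaviour with an exact comparison of attribute values; it is not calibre's or Beautiful Soup's actual code.

```python
def matches(tag_name, tag_attrs, spec):
    """Hypothetical stand-in for the tag search: every key present in spec must match."""
    # 'name' is optional: when it is missing, any tag name is accepted.
    if 'name' in spec and spec['name'] != tag_name:
        return False
    # 'attrs' is optional too: each listed attribute must equal the tag's value.
    for key, wanted in spec.get('attrs', {}).items():
        if tag_attrs.get(key) != wanted:
            return False
    return True


without_name = dict(attrs={'class': 'asset story clearfix'})
with_name = dict(name='article', attrs={'class': 'asset story clearfix'})

# Attributes of an example tag, as a plain dict (an assumption for illustration).
cls = {'class': 'asset story clearfix'}

print(matches('article', cls, without_name))  # True
print(matches('div', cls, without_name))      # True
print(matches('article', cls, with_name))     # True
print(matches('div', cls, with_name))         # False
```

So under this assumption, name='article' would only narrow the match to article tags, and it would matter only when other tags share the same class.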