Remove hyperlink properties from inside <i> etc
I know this small piece of code converts links to text:
def preprocess_html(self, soup):
    for alink in soup.findAll('a'):
        if alink.string is not None:
            tstr = alink.string
            alink.replaceWith(tstr)
    return soup
But how do you convert links to text when they are combined with other tags such as 'h2', 'strong', 'i', etc.? For example:
<h2>
<a href="http://www.filmcritic.com/reviews/in-theaters">In Theaters</a>
</h2>
or like this:
<a href="http://www.filmcritic.com/reviews/1937/snow-white-and-the-seven-dwarfs/"><i>Snow White</i></a>
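One possible approach, sketched here on the assumption that the Beautiful Soup 4 API (the `bs4` package) is available: `unwrap()` removes a tag but leaves its children in place, so nested markup such as <i> survives. The helper name `strip_links` is made up for illustration, and the BeautifulSoup bundled with calibre may be an older version without `unwrap()`, so treat this as a sketch rather than a drop-in recipe method.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed


def strip_links(soup):
    """Replace every <a> tag with its own children (hypothetical helper)."""
    for alink in soup.find_all('a'):
        # unwrap() drops the <a> tag itself but keeps whatever was inside it,
        # whether that is plain text or further tags such as <i>.
        alink.unwrap()
    return soup


# The two cases from the question, joined into one snippet.
html = (
    '<h2><a href="http://www.filmcritic.com/reviews/in-theaters">In Theaters</a></h2>'
    '<a href="http://www.filmcritic.com/reviews/1937/snow-white-and-the-seven-dwarfs/">'
    '<i>Snow White</i></a>'
)
result = str(strip_links(BeautifulSoup(html, 'html.parser')))
print(result)  # <h2>In Theaters</h2><i>Snow White</i>
```

Inside a recipe, the same loop could go in the body of preprocess_html(self, soup) in place of the replaceWith() version, provided the soup object there supports unwrap().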