03-04-2011, 06:15 PM   #1
mufc
Remove hyperlink properties from inside <i> etc

I know this small piece of code, which converts all links to text:

def preprocess_html(self, soup):
    for alink in soup.findAll('a'):
        if alink.string is not None:
            tstr = alink.string
            alink.replaceWith(tstr)
    return soup

But how do you convert links to text when they are combined with other tags such as 'h2', 'strong', 'i', etc.? For example, a link inside a heading:

<h2>
<a href="http://www.filmcritic.com/reviews/in-theaters">In Theaters</a>
</h2>

Or a link that wraps another tag, like this:

<a href="http://www.filmcritic.com/reviews/1937/snow-white-and-the-seven-dwarfs/"><i>Snow White</i></a>
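
One possible direction, offered as a rough sketch rather than a tested recipe: instead of walking the soup, strip the <a> and </a> tags from the raw HTML with a regular expression, so that whatever they wrap (plain text or an inner <i>) stays in place. Note that this swaps the BeautifulSoup approach above for plain `re`; the names `ANCHOR_TAG` and `strip_links` are made up for illustration, and the pattern assumes no '>' character appears inside an attribute value.

```python
import re

# Matches an opening <a ...> tag or a closing </a> tag.
# The \b stops tags such as <abbr> or <article> from matching.
ANCHOR_TAG = re.compile(r'</?a\b[^>]*>', re.IGNORECASE)


def strip_links(html):
    """Remove <a> and </a> tags but keep whatever they wrap (hypothetical helper)."""
    return ANCHOR_TAG.sub('', html)


if __name__ == '__main__':
    nested = ('<a href="http://www.filmcritic.com/reviews/1937/'
              'snow-white-and-the-seven-dwarfs/"><i>Snow White</i></a>')
    print(strip_links(nested))  # prints <i>Snow White</i>
```

In a calibre recipe, a pattern like this could presumably be hooked in through the recipe's preprocess_regexps list; that wiring is not shown or tested here.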