Quote:
Originally Posted by TonytheBookworm
Starson17,
I simply am not understanding it yet even after reading the documentation and some of the code you have posted. Basically I'm wondering why this will not work...
Code:
for credit_tag in soup.findAll('span', attrs={'class':['imageCredit rightFloat']}):
p = Tag(soup, 'p')
span.replaceWith(p)
I get no soup and the article is blank 
|
1) You should be doing a credit_tag.replaceWith(p) instead of span.replaceWith(p)
2) It's not sufficient to just replace your found credit_tag (the span tags) with your new Tag p. You created p from whole cloth. It has no contents, so when you did the replaceWith (or tried to) you lost everything inside the span. A Tag object is the entire tag, from <span> to </span> or <p> to </p>, with all content therebetween. Don't expect it to do a search and replace of the text string "span" with the text "p."
If you want that, similar code is often used to regex replace all <table>, <tr> and <td> with <div>. I believe that code is posted here and used in lots of (older) recipes. It's basically what linearize_tables does. Let me know if you need me to find and post it.