Quote:
Originally Posted by lordvetinari2
We are almost there.
Sorry, but no, you're not.

The last little bit is often the hardest.
I looked at it.
Quote:
There are a few elements there (ECOSFERA_polaroid and ECOSFERA_link_rel) that I am trying to remove, but within these father elements there are child elements also using ECOSFERA_texto_01. How do I say "keep element X, as long as X is not within Y"?
It can be done, but not with the simple "keep" tag statements you are using. See below.
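As a rough sketch of the "keep X, as long as X is not within Y" idea: walk over every X and check whether any of its ancestors is a Y. The markup here is made up for illustration (only the class names come from your post), `keep_outside` is a hypothetical helper name, and I am assuming the standalone bs4 package rather than whatever BeautifulSoup version your tool bundles.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names are taken from the thread.
html = """
<div class="ECOSFERA_texto_01">Article body</div>
<div class="ECOSFERA_polaroid">
  <div class="ECOSFERA_texto_01">Photo caption</div>
</div>
<div class="ECOSFERA_link_rel">
  <div class="ECOSFERA_texto_01">Related links</div>
</div>
"""

UNWANTED_PARENTS = ["ECOSFERA_polaroid", "ECOSFERA_link_rel"]


def keep_outside(soup, wanted, unwanted_parents):
    """Return tags with class `wanted` that have no ancestor carrying
    any of the classes in `unwanted_parents`."""
    kept = []
    for tag in soup.find_all(class_=wanted):
        # find_parent returns None when no matching ancestor exists.
        if tag.find_parent(class_=unwanted_parents) is None:
            kept.append(tag)
    return kept


soup = BeautifulSoup(html, "html.parser")
kept = keep_outside(soup, "ECOSFERA_texto_01", UNWANTED_PARENTS)
kept_text = [tag.get_text(strip=True) for tag in kept]
print(kept_text)
```

With this sample markup only the top-level "Article body" element survives. In practice it is often simpler to go the other way and remove the unwanted parent elements first, which takes their children with them.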
Quote:
Finally, the links on the bottom right corner under "Legislação" should not appear either. They are not in any specifically named div or table, so I do not know how to deal with them.
If the tags aren't labeled with class or id, etc., they can't easily be referenced for removal or to be kept. There are other ways to reference them, but now you are adding significant complexity. Basically, you use BeautifulSoup and find tags by position relative to other tags.
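To give an idea of what "by position relative to other tags" can look like, here is a minimal sketch. It assumes the unwanted links are siblings that follow an element whose text is exactly "Legislação"; the real page may well be laid out differently, so the markup is purely hypothetical. It also assumes the standalone bs4 package.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: unlabeled links sitting after a "Legislação" label.
html = """
<div class="ECOSFERA_texto_01">Article body</div>
<b>Legislação</b>
<a href="/lei-1">Lei 1</a>
<a href="/lei-2">Lei 2</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the text node that serves as the landmark.
label = soup.find(string="Legislação")
if label is not None:
    heading = label.parent  # the <b> tag wrapping the text
    # Remove every <a> sibling that comes after the landmark...
    for link in heading.find_next_siblings("a"):
        link.decompose()
    # ...and then the landmark itself.
    heading.decompose()

remaining_links = soup.find_all("a")
remaining_text = soup.get_text(strip=True)
print(remaining_text)
```

The fragile part is the landmark: if the site changes the label text or nests the links differently, the lookup quietly finds nothing, which is why this is more work than referencing a class or id.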
Read this and this and this and this (particularly the last one, on BeautifulSoup).