boocko
11-17-2010, 06:24 PM
Hello!
How does keep_only_tags pass parameters to soup's findAll?
Is it possible to do something like:
soup.findAll('p', limit=3)
with keep_only_tags syntax?
I want to keep only the first div tag with a certain class attribute.
Starson17
11-18-2010, 11:38 AM
Is it possible to do something like:
soup.findAll('p', limit=3)
with keep_only_tags syntax?
It's an interesting question, and one I don't know the answer to.
I want to keep only the first div tag with a certain class attribute.
If there's only one of them, you can define it easily without the limit, but I assume there's more than one. You can always remove_tags_after the tag of interest. Or you can preprocess_html and use findAll with the limit parameter directly.
kovidgoyal
11-18-2010, 11:50 AM
You use a dictionary to specify keep_only_tags. That dictionary is converted to keyword arguments and passed to findAll, so you can use any keyword argument that findAll supports.
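For the original question, that would presumably mean putting limit=1 into the dictionary. Here is a rough sketch of the unpacking only, not calibre's actual code: find_all below is a hypothetical stand-in for soup's findAll, the page is a toy flat list rather than a real soup tree, and the class name 'story' is a placeholder.

```python
# Hypothetical stand-in for BeautifulSoup's findAll, written only to show
# how a keep_only_tags dictionary turns into keyword arguments.
def find_all(tags, name=None, attrs=None, limit=None):
    """Return tags matching name and attrs, stopping after `limit` matches."""
    attrs = attrs or {}
    matches = []
    for tag in tags:
        if name is not None and tag["name"] != name:
            continue
        if any(tag["attrs"].get(key) != value for key, value in attrs.items()):
            continue
        matches.append(tag)
        if limit is not None and len(matches) >= limit:
            break
    return matches


# Toy document: a flat list of tags (a real soup is a tree).
page = [
    {"name": "div", "attrs": {"class": "story"}, "text": "first"},
    {"name": "p", "attrs": {}, "text": "intro"},
    {"name": "div", "attrs": {"class": "story"}, "text": "second"},
]

# Recipe-style spec; 'story' is a placeholder class name.
keep_only_tags = [dict(name="div", attrs={"class": "story"}, limit=1)]

kept = []
for spec in keep_only_tags:
    # Each dictionary is unpacked into keyword arguments, limit included.
    kept.extend(find_all(page, **spec))
```

With limit=1 in the spec, only the first matching div ends up in kept. Without it, both divs with that class would be kept.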
Starson17
11-18-2010, 11:59 AM
You use a dictionary to specify keep_only_tags. That dictionary is converted to keyword arguments and passed to findAll
That's simple enough. Thanks for the clarification.