Quote:
Originally Posted by cisaak
Thanks again for the help.
Inside the first h1 tag there is:
<a title="(text of different headline)" href="/">(text of headline I want)</a>
Nothing inside the second h1 tag.
This applies to any article in the online version of the St Louis Post-Dispatch.
Use Firefox and Firebug to find a tag that contains the <h1> tag you don't want, then use remove_tags to remove it.
It looks to me like you've got it backwards. I think you want to keep the second tag, the one without the <a> tag. The second one is the title for your article.
Try this:
Code:
remove_tags = [dict(name='div', attrs={'id': 'blox-header'})]
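
In case it helps to see where that line goes, here is a rough sketch of it inside a recipe class. The class name and title are placeholders I made up, and it assumes the unwanted <h1> really does sit inside the div with id blox-header, so check that in Firebug first. The try/except is only there so the snippet also loads outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Not running inside calibre: use a stand-in base class so the
    # snippet can still be loaded and inspected.
    BasicNewsRecipe = object


class PostDispatchSketch(BasicNewsRecipe):
    # Placeholder title, not taken from the actual recipe.
    title = 'St Louis Post-Dispatch (sketch)'

    # Remove the header div, assumed to hold the first (unwanted) <h1>,
    # so the second <h1> is the only headline left in the article.
    remove_tags = [dict(name='div', attrs={'id': 'blox-header'})]

    # The rest of your existing recipe (feeds and so on) stays as it is.
```

Each entry in remove_tags is a dict with the tag name and the attributes to match, so you can add more entries to the list if other unwanted blocks turn up.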