11-01-2010, 10:26 PM   #1
Pmykland
Device: Kindle3
Need Help Minneapolis Star Tribune

Hey folks,
I am trying to clean up the following recipe, but I am having problems with the last part.

Here is the recipe:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1288138924(BasicNewsRecipe):
    title = u'StarTribune'
    oldest_article = 7  # days
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True
    masthead_url = 'http://stmedia.startribune.com/designimages/FlagLogo081610.png'

    feeds = [
        (u'Nation', u'http://www.startribune.com/nation/index.rss2'),
        (u'Local', u'http://www.startribune.com/local/index.rss2'),
        (u'Sports', u'http://www.startribune.com/sports/index.rss2'),
        (u'Politics', u'http://www.startribune.com/politics/national/index.rss2'),
        (u'Entertainment', u'http://www.startribune.com/entertainment/index.rss2'),
        (u'Business', u'http://www.startribune.com/business/index.rss2'),
    ]

    # Keep only the main article column.
    keep_only_tags = [
        dict(name='div', attrs={'class': 'columnOne'}),
    ]

    # Strip navigation and sidebar blocks inside the article column.
    remove_tags = [
        dict(name='div', attrs={'class': 'sidebar'}),
        dict(name='div', attrs={'class': 'pager'}),
        dict(name='div', attrs={'class': 'nextStoryBlock'}),
        dict(name='p', attrs={'class': 'sectionpath'}),
    ]

    # Remove everything that comes after the h2 tag; the h2 itself stays.
    remove_tags_after = dict(name='h2')

I am trying to remove all of the comment blocks after the article by setting remove_tags_after to the "h2" tag. I also want the "h2" tag itself omitted, since it only says "more ...News", with the wording depending on the feed.

How can I remove the h2 tag and everything after it in the same recipe?
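
To make clear what I am after, here is a rough sketch of the tree operation itself, written outside calibre with only the Python standard library. The helper name drop_tag_and_following and the sample markup are made up for illustration and are not part of calibre or of the real Star Tribune pages; it is only meant to show "drop the first h2 and everything that follows it".

```python
import xml.etree.ElementTree as ET


def drop_tag_and_following(root, tag_name):
    """Remove the first <tag_name> element and everything after it in document order.

    Hypothetical helper for illustration; it is not part of calibre.
    """
    target = root.find('.//' + tag_name)
    if target is None:
        return root  # nothing to trim

    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    node = target
    while node is not root:
        parent = parents[node]
        siblings = list(parent)
        # Remove every sibling that follows this node, plus any trailing text.
        for later in siblings[siblings.index(node) + 1:]:
            parent.remove(later)
        node.tail = None
        node = parent

    # Finally remove the matched tag itself.
    parents[target].remove(target)
    return root


# Made-up markup loosely shaped like the article page described above.
SAMPLE = (
    "<div class='columnOne'>"
    "<p>Article text</p>"
    "<div><h2>more Local News</h2><p>Related links</p></div>"
    "<div class='comments'>Comment block</div>"
    "</div>"
)

trimmed = drop_tag_and_following(ET.fromstring(SAMPLE), 'h2')
print(ET.tostring(trimmed, encoding='unicode'))
```

In the sample, the article paragraph survives while the h2, the links next to it, and the comment block are all gone. In the recipe the same idea would have to be expressed against the parsed page that calibre works on, and that is the part I have not worked out.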

Thanks in advance