Hello everyone.
I need some help with a recipe for this feed:
http://www.pcper.com/rss/articles.rss
Most of the articles span several pages. I've cleaned it up a bit, but I'm not sure how to scrape the complete article from the "Click here for the Detailed Review" links. Thanks!
Here's what I have so far.
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1274998412(BasicNewsRecipe):
    title = u'PC Perspective Articles'
    description = 'PC Perspective Articles'
    __author__ = 'KidTwisted'
    #use_embedded_content = False
    max_articles_per_feed = 25
    oldest_article = 7
    cover_url = 'http://www.pcper.com/site_gfx/pcpheader_02.gif'
    no_stylesheets = True
    language = 'en'
    remove_javascript = True
    conversion_options = {'linearize_tables': True}
    # reverse_article_order = True

    # Page chrome (headers, navigation, side images) to strip from each article
    remove_tags = [
        dict(name='table', attrs={'class': 'topwrapper'}),
        dict(name='div', attrs={'class': 'leftcatimg'}),
        dict(name='div', attrs={'class': 'navcontainer1'}),
        dict(name='td', attrs={'class': 'img3'}),
        dict(name='div', attrs={'class': 'mtbg'}),
        dict(name='div', attrs={'class': 'rightcatimg'}),
        dict(name='td', attrs={'class': 'articlelinks'}),
        dict(id='navcontainer'),
    ]
    # Drop everything that follows this element
    remove_tags_after = dict(name='div', attrs={'class': 'rightcatimg'})

    feeds = [(u'PC Perspective Articles', u'http://www.pcper.com/rss/articles.rss')]
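
The missing piece is locating the "Click here for the Detailed Review" link on the summary page. Below is a rough, standalone sketch of just that step, using only the Python 3 standard library rather than calibre itself. The class and function names, the sample HTML, and the /article.php?aid=123 path are all made up for illustration, and the real page markup may well differ. Inside a recipe, logic like this would probably live in a hook such as preprocess_html, which is not shown here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DetailedReviewLinkFinder(HTMLParser):
    """Hypothetical helper: remember the href of the first <a> whose
    text contains a marker phrase (case-insensitive)."""

    def __init__(self, marker="Detailed Review"):
        super().__init__()
        self.marker = marker.lower()
        self.found = None   # href of the first matching link
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Start tracking a link only while nothing has matched yet
        if tag == "a" and self.found is None:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.marker in "".join(self._text).lower():
                self.found = self._href
            self._href = None


def find_detailed_review_url(html, base_url):
    """Return the absolute URL of the detailed-review link, or None."""
    finder = DetailedReviewLinkFinder()
    finder.feed(html)
    finder.close()
    return urljoin(base_url, finder.found) if finder.found else None


# Made-up markup standing in for a summary page
sample_html = (
    '<p><a href="/index.php">Home</a> '
    '<a href="/article.php?aid=123"><b>Click here for the Detailed Review</b></a></p>'
)
detailed_url = find_detailed_review_url(sample_html, "http://www.pcper.com/")
```

Resolving the href against the page URL with urljoin covers the case where the site uses relative links. Once the full-article URL is known, the recipe would still need to fetch that page and follow any further page links, which this sketch does not attempt.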