MobileRead Forums - View Single Post - Custom recipes (archive, read-only)

kidtwisted · 05-27-2010, 10:36 PM

Hello everyone.

I need some help with a recipe for this feed:
http://www.pcper.com/rss/articles.rss

Most of the articles span several pages. I've cleaned the output up a bit, but I'm not sure how to scrape the complete article from the "Click here for the Detailed Review" links. Thanks!

Here's what I have so far.

Code:

from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1274998412(BasicNewsRecipe):
    title = u'PC Perspective Articles'
    description = 'PC Perspective Articles'
    __author__ = 'KidTwisted'
    #use_embedded_content   = False
    max_articles_per_feed = 25
    oldest_article = 7
    cover_url      = 'http://www.pcper.com/site_gfx/pcpheader_02.gif'

    no_stylesheets = True
    language = 'en'

    remove_javascript = True
    conversion_options = { 'linearize_tables' : True}
    # reverse_article_order = True

    remove_tags = [dict(name='table', attrs={'class':'topwrapper'}),
                            dict(name='div', attrs={'class':'leftcatimg'}),
                            dict(name='div', attrs={'class':'navcontainer1'}),
                            dict(name='td', attrs={'class':'img3'}),
                            dict(name='div', attrs={'class':'mtbg'}),
                            dict(name='div', attrs={'class':'rightcatimg'}),
                            dict(name='td', attrs={'class':'articlelinks'}),
                            dict(id='navcontainer')]

    remove_tags_after = dict(name='div', attrs={'class':'rightcatimg'})


    feeds =  [ (u'PC Perspective Articles', u'http://www.pcper.com/rss/articles.rss') ]
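
For the multi-page part, the first step is probably just finding where the "Click here for the Detailed Review" link points. Below is a rough, standalone sketch of that step using only the Python standard library. The names DetailedReviewLinkFinder and find_detailed_review_links are made up for illustration, and the assumption that the link is a plain <a href="..."> whose text contains that phrase has not been checked against the real pcper.com markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Link text quoted in the post above; the page markup itself is an assumption.
LINK_TEXT = 'Click here for the Detailed Review'


class DetailedReviewLinkFinder(HTMLParser):
    """Hypothetical helper: collect the href of every <a> whose text contains LINK_TEXT."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            # Collapse whitespace so line breaks inside the link do not matter.
            text = ' '.join(''.join(self._text).split())
            if LINK_TEXT.lower() in text.lower():
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, self._href))
            self._href = None


def find_detailed_review_links(html, base_url):
    """Return absolute URLs of all 'Detailed Review' links found in html."""
    finder = DetailedReviewLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.links
```

Inside the recipe, something like this could presumably be driven from preprocess_html, with each linked page fetched through self.index_to_soup and its body appended to the current article, but that wiring is untested here.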