10-07-2014, 07:52 AM   #4
dkfurrow
Thanks for the reply. Yeah, I got similar results on my run. You're right, they must be making some changes.

The only issue I had was that the opinion section still isn't downloading. I know you put in a fix for that a few weeks ago, and the articles in question do contain the tag with the "article-contents" id, but for some reason it's not working. I tried a few other combinations of tags in the keep_only list, but still couldn't get it to work (except by removing keep_only altogether, which lets too much extra stuff through). Below is a sample parse_index function with two articles, one that works and one that doesn't. Do you get similar results?

Code:
def parse_index(self):
    # Minimal index with two hand-picked articles, to isolate the problem.
    feeds = []
    articles = []

    # will parse
    title1 = 'HP Article WSJ'
    desc1 = 'about Hewlett Packard'
    url1 = 'http://online.wsj.com/articles/hewlett-packard-split-comes-as-more-investors-say-big-isnt-better-1412643100'
    articles.append({'title': title1, 'url': url1, 'description': desc1, 'date': ''})

    # won't parse
    title = "Stephens Article in WSJ"
    desc = 'china bubble story'
    url = 'http://online.wsj.com/articles/bret-stephens-hong-kong-pops-the-china-bubble-1412636585'
    articles.append({'title': title, 'url': url, 'description': desc, 'date': ''})

    # Log the titles; this form prints the same under Python 2 and Python 3.
    for article in articles:
        print("title: %s" % article['title'])

    # Both articles go into a single section.
    section = "This Sample Section"
    feeds.append((section, articles))
    return feeds
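
One way to narrow this down might be to save the HTML the recipe actually downloads for each of the two articles and check whether an element with the "article-contents" id is really in it, in case the page served to the recipe differs from what the browser shows. The following is only a rough sketch using the Python 3 standard library; it is not calibre's own filtering code, and the names IdFinder and tags_with_id as well as the sample HTML strings are made up for illustration.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Collect the tag name of every element whose id equals target_id."""

    def __init__(self, target_id):
        HTMLParser.__init__(self)
        self.target_id = target_id
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this start tag.
        if dict(attrs).get('id') == self.target_id:
            self.matches.append(tag)


def tags_with_id(html, target_id):
    """Return the tag names of all elements in html carrying target_id."""
    finder = IdFinder(target_id)
    finder.feed(html)
    finder.close()
    return finder.matches


# Made-up stand-ins for the saved pages, not real WSJ markup.
page_with_id = '<html><body><div id="article-contents"><p>text</p></div></body></html>'
page_without_id = '<html><body><div id="something-else"><p>text</p></div></body></html>'

print(tags_with_id(page_with_id, 'article-contents'))     # ['div']
print(tags_with_id(page_without_id, 'article-contents'))  # []
```

If the saved opinion page gives an empty list here, the id is missing from what the recipe receives and the keep_only rule has nothing to match. If it gives a non-empty list, the problem is more likely somewhere else in the recipe.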