06-19-2012, 08:02 AM   #3
NotTaken
Here is an example using spynner:

Code:
import spynner
from multiprocessing import Process, Queue

from calibre.web.feeds.news import BasicNewsRecipe


class MarketingSensoriale(BasicNewsRecipe):

    title                 = u'Marketing sensoriale'
    description           = 'Marketing Sensoriale, il Blog'
    category              = 'Blog'
    oldest_article        = 7
    max_articles_per_feed = 200
    no_stylesheets        = True
    encoding              = 'utf8'
    use_embedded_content  = False
    language              = 'it'
    remove_empty_feeds    = True
    recursions            = 0
    auto_cleanup          = False

    remove_tags_after    = [dict(name='div', attrs={'class':['article-footer']})]
    

    def get_article_url(self, article):
        return article.get('feedburner_origlink',  None)


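    def grab(self, q, url):
        # Runs in a child process: render the page with spynner and
        # send the resulting HTML back to the parent through the queue.
        browser = None
        try:
            browser = spynner.Browser()
            browser.load(url)
            # 10 second timeout
            browser.wait_load(10)
            q.put(browser.html)
        except Exception:
            # Tell the parent that rendering failed
            q.put(None)
        finally:
            if browser is not None:
                browser.close()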

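    def preprocess_raw_html(self, raw, url):
        q = Queue()
        p = Process(target=self.grab, args=(q, url))
        p.start()
        # Read the result before joining so the child is never left
        # blocked on a full queue
        html = q.get()
        p.join()
        # Fall back to the HTML calibre already downloaded if spynner failed
        return html if html is not None else raw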


    feeds          = [(u'Marketing sensoriale', u'http://feeds.feedburner.com/MarketingSensoriale?format=xml')]

You need to install spynner before this will work.
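
If the Process/Queue part of the recipe is unclear, here is a minimal sketch of just that hand-off, with the spynner call replaced by a stand-in so it runs without spynner or calibre. The names worker and fetch_in_subprocess, the fake HTML string, and the 30 second timeout are my own placeholders, not part of either library.

```python
from multiprocessing import Process, Queue

try:
    from queue import Empty  # Python 3
except ImportError:
    from Queue import Empty  # Python 2


def worker(q, url):
    # Stand-in for the spynner call (hypothetical): a real recipe would
    # load the page in a browser here and put browser.html on the queue.
    try:
        if not url:
            raise ValueError("empty url")
        q.put("<html>rendered %s</html>" % url)
    except Exception:
        # None tells the parent that rendering failed
        q.put(None)


def fetch_in_subprocess(url, fallback, timeout=30):
    # Run worker() in a child process and wait for its result.
    q = Queue()
    p = Process(target=worker, args=(q, url))
    p.start()
    try:
        # Read before joining so the child is never blocked on the queue
        result = q.get(timeout=timeout)
    except Empty:
        # The child never answered: stop it and use the fallback
        result = None
        p.terminate()
    p.join()
    return result if result is not None else fallback


if __name__ == "__main__":
    print(fetch_in_subprocess("http://example.com/", "<html>raw</html>"))
```

The timeout on q.get() is the main difference from the recipe above: if the child process dies without putting anything on the queue, the parent gives up and returns the fallback instead of hanging forever.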