06-21-2012, 04:04 PM   #10
NotTaken
Changed the recipe to use jsbrowser and calibre's fork_job (it still requires a plugin, attached):

Code:
from calibre.web.feeds.news import BasicNewsRecipe
import os
from calibre.utils.ipc.simple_worker import fork_job
from calibre_plugins.recipe_fork_helper import wrapper
import tempfile

dummy_module = '''

import calibre.web.jsbrowser.browser as jsbrowser

def grab(url):
    browser = jsbrowser.Browser()
    # Load the page, giving up after 10 seconds
    browser.visit(url, 10)
    # Keep the browser running for 10 more seconds before reading the HTML
    browser.run_for_a_time(10)
    html = browser.html
    browser.close()
    return html

'''
class MarketingSensoriale(BasicNewsRecipe):

    title                 = u'Marketing sensoriale'
    description           = 'Marketing Sensoriale, il Blog'
    category              = 'Blog'
    oldest_article        = 7
    max_articles_per_feed = 200
    no_stylesheets        = True
    encoding              = 'utf8'
    use_embedded_content  = False
    language              = 'it'
    remove_empty_feeds    = True
    recursions = 0
    auto_cleanup = False
    delay = 0.00000001

    remove_tags_after    = [dict(name='div', attrs={'class':['article-footer']})]
    

    def get_article_url(self, article):
        return article.get('feedburner_origlink',  None)

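    def preprocess_raw_html(self, raw, url):
        # Write the helper module's source to a temp file for the child process.
        temp_handle, temp_path = tempfile.mkstemp()
        with os.fdopen(temp_handle, 'w') as f:
            f.write(dummy_module)

        try:
            # Note: (url) is a plain string, not a one-element tuple; whether
            # the wrapper expects (url,) depends on the attached plugin.
            result = fork_job('calibre_plugins.recipe_fork_helper', 'wrapper',
                              (temp_path, 'grab', (url)))
        finally:
            # Remove the temp file even if the forked job fails.
            try:
                os.remove(temp_path)
            except OSError:
                print('could not delete temp file: ' + temp_path)

        # fork_job returns a dict; the function's return value is under 'result'.
        return result['result']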


    feeds          = [(u'Marketing sensoriale', u'http://feeds.feedburner.com/MarketingSensoriale?format=xml')]
The plugin loads a module from a file containing Python source and calls the function named in the second argument. I couldn't find a way to import the recipe itself over on the dark side (the child process). Maybe someone knows how?
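
For readers who don't want to open the zip, here is a rough sketch of what such a helper can look like. It is not the attached plugin's actual source: the body of wrapper, the call_from_source helper and the demo_source string are all assumptions made for illustration, and the stand-in grab does not use jsbrowser so the sketch runs without calibre. In the recipe the wrapper would be run in the child process by fork_job rather than called directly.

```python
import os
import tempfile
import types


def wrapper(path, func_name, args=()):
    """Load Python source from `path` and call the function named `func_name`.

    Hypothetical stand-in for the plugin's wrapper; the name and argument
    order only mirror the fork_job call in the recipe.
    """
    with open(path) as f:
        source = f.read()
    # Execute the source inside a fresh module object so its top-level
    # names (such as grab) end up in that module's namespace.
    module = types.ModuleType('dummy_module')
    exec(compile(source, path, 'exec'), module.__dict__)
    # Accept a bare value such as (url), which is a string, not a tuple.
    if not isinstance(args, (tuple, list)):
        args = (args,)
    return getattr(module, func_name)(*args)


def call_from_source(source, func_name, args=()):
    """Write `source` to a temp file, run `func_name` from it, then clean up."""
    handle, path = tempfile.mkstemp()
    try:
        with os.fdopen(handle, 'w') as f:
            f.write(source)
        return wrapper(path, func_name, args)
    finally:
        os.remove(path)


# Stand-in for dummy_module that needs no calibre install (assumption).
demo_source = '''
def grab(url):
    return '<html>' + url + '</html>'
'''

html = call_from_source(demo_source, 'grab', 'http://example.com')
```

Executing the source into a throwaway module object keeps the helper independent of the temp file's name, which matters here because mkstemp() produces a path without a .py suffix.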

Edit: i.e., can you give fork_job a well-defined module name for the recipe directly, so that all the code is contained within the recipe?
Attached Files
fork_helper.zip (992 Bytes)

Last edited by NotTaken; 06-21-2012 at 04:22 PM.