06-23-2012, 08:56 PM | #16 | |
Connoisseur
Posts: 65
Karma: 4640
Join Date: Aug 2011
Device: kindle
|
Quote:
The latest recipe won't work until the next release, but if you want, you can use the previous recipe with the fork_helper plugin.
|
06-23-2012, 10:48 PM | #17 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yes, it will be included in the next calibre release. And just for completeness, here is a version that avoids double-fetching the article HTML.
Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.utils.ipc.simple_worker import fork_job
from calibre.ptempfile import PersistentTemporaryFile

js_fetcher = '''
import calibre.web.jsbrowser.browser as jsbrowser

def grab(url):
    browser = jsbrowser.Browser()
    #10 second timeout
    browser.visit(url, 10)
    browser.run_for_a_time(10)
    html = browser.html
    browser.close()
    return html
'''

class MarketingSensoriale(BasicNewsRecipe):

    title = u'Marketing sensoriale'
    description = 'Marketing Sensoriale, il Blog'
    category = 'Blog'
    oldest_article = 7
    max_articles_per_feed = 200
    no_stylesheets = True
    encoding = 'utf8'
    use_embedded_content = False
    language = 'it'
    remove_empty_feeds = True
    recursions = 0
    requires_version = (0, 8, 58)
    auto_cleanup = False
    simultaneous_downloads = 1
    articles_are_obfuscated = True

    remove_tags_after = [dict(name='div', attrs={'class':['article-footer']})]

    def get_article_url(self, article):
        return article.get('feedburner_origlink', None)

    def get_obfuscated_article(self, url):
        result = fork_job(js_fetcher, 'grab', (url,), module_is_source_code=True)
        html = result['result']
        if isinstance(html, type(u'')):
            html = html.encode('utf-8')
        pt = PersistentTemporaryFile('.html')
        pt.write(html)
        pt.close()
        return pt.name

    feeds = [(u'Marketing sensoriale', u'http://feeds.feedburner.com/MarketingSensoriale?format=xml')]
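For anyone who wants to try the last step of get_obfuscated_article outside calibre, here is a rough sketch of the same idea: encode the fetched HTML to UTF-8 bytes, write it to a temporary file that survives being closed, and hand back the path. It assumes Python 3 and uses the standard library's tempfile.NamedTemporaryFile with delete=False as a stand-in for calibre's PersistentTemporaryFile. The function name save_html_to_temp_file is made up for illustration and is not part of calibre.

```python
import os
import tempfile


def save_html_to_temp_file(html):
    """Write html (text or bytes) to a temp file and return the file's path.

    Hypothetical helper: a stdlib stand-in for the PersistentTemporaryFile
    step in the recipe, not calibre's own implementation.
    """
    # Text must become bytes before it is written in binary mode.
    if isinstance(html, str):
        html = html.encode('utf-8')
    # delete=False keeps the file on disk after it is closed, so the
    # caller can open it later by path.
    with tempfile.NamedTemporaryFile(suffix='.html', delete=False) as f:
        f.write(html)
        return f.name


if __name__ == '__main__':
    path = save_html_to_temp_file(u'<html><body>Caff\u00e8</body></html>')
    with open(path, 'rb') as fh:
        print(fh.read().decode('utf-8'))
    # Unlike a recipe run, nothing cleans this file up for us.
    os.remove(path)
```

The file has to outlive the function call because the caller only receives a path and opens the file afterwards, which is why an auto-deleting temporary file would not work here.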