Pocket recipe

belano · 04-14-2013, 01:35 PM

Hi,

I've tried to use this recipe to fetch articles from my Pocket (formerly Read It Later) account, with no success:

https://github.com/tbunnyman/ReadItL...in/tree/api-v3

This recipe manages to fetch the articles, but they all contain the following text:

Quote:

Looks like you're kicking it old school.
The browser version you are using is outdated and does not work with the latest version of Pocket's web app.

Is there any way to trick Pocket into believing that I am using a newer browser version?

I've tried setting the user-agent manually, but it didn't work:

Code:

kwargs['user_agent'] = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4'
br = BasicNewsRecipe.get_browser(self)

Thanks

kovidgoyal · 04-14-2013, 01:49 PM

br = BasicNewsRecipe.get_browser(self, user_agent='whatever')

belano · 04-21-2013, 06:07 AM

Quote:

Originally Posted by kovidgoyal

br = BasicNewsRecipe.get_browser(self, user_agent='whatever')

Thanks, that did the trick.

Unfortunately, the articles show up empty, as the actual content seems to be fetched separately via AJAX. It seems they don't want us to access it.

kovidgoyal · 04-21-2013, 07:03 AM

You can use a JavaScript-enabled browser to download articles if you like; see, for example, the marketing_sensoriale.recipe.

Or you can just simulate AJAX requests with the normal mechanize browser, which is a little harder to implement. For an example, see the built-in metadata download plugin for bigbooksearch.com.
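As a rough illustration of the second approach, the sketch below builds the kind of request a page's JavaScript would send and parses a JSON reply. It is written for Python 3 and uses only the standard library rather than mechanize. The endpoint URL, the `item_id` parameter, and the `article`/`title`/`body` JSON keys are all hypothetical; the real ones would have to be found by watching the site's network traffic in a desktop browser.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
AJAX_URL = "https://example.com/ajax/get_article"


def build_ajax_request(item_id, user_agent):
    """Build a POST request that mimics what in-page JavaScript would send."""
    data = urlencode({"item_id": item_id}).encode("utf-8")
    headers = {
        "User-Agent": user_agent,
        # Many servers use this header to tell AJAX calls from page loads.
        "X-Requested-With": "XMLHttpRequest",
        "Accept": "application/json",
    }
    # Supplying data makes urllib issue a POST instead of a GET.
    return Request(AJAX_URL, data=data, headers=headers)


def article_html_from_reply(raw_reply):
    """Turn a JSON reply (bytes) into a small HTML document.

    The "article", "title" and "body" keys are assumptions.
    """
    payload = json.loads(raw_reply.decode("utf-8"))
    article = payload["article"]
    return "<html><head><title>%s</title></head><body>%s</body></html>" % (
        article["title"], article["body"])
```

The request could be opened with `urllib.request.urlopen`, and the reply bytes handed to `article_html_from_reply` before being written to the temporary file the recipe returns.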

belano · 04-21-2013, 08:10 AM

Thanks for the tip, will give it a try

belano · 04-21-2013, 10:14 AM

Hi,

I've been digging into marketing_sensoriale.recipe. This is what I have inside my custom module:

Code:

js_fetcher = '''
import calibre.web.jsbrowser.browser as jsbrowser

def grab(url):
    browser = jsbrowser.Browser()
    # Load the page with a 10 second timeout
    browser.visit(url, 10)
    # Give the page's scripts 10 more seconds to run
    browser.run_for_a_time(10)
    html = browser.html
    browser.close()
    return html
'''

Do you think it is possible to log in with my credentials, set the auth cookie on the browser, and keep grabbing the HTML?

Thanks

belano · 04-21-2013, 11:16 AM

One last question. I haven't done much Python, so please bear with me.

What is the correct syntax for the fork_job call in order to pass an additional parameter to my custom module? I've tried the following, but it keeps complaining:

Code:

def get_obfuscated_article(self, url):
    br = self.browser
    result = fork_job(js_fetcher, 'grab', args=(url, br), module_is_source_code=True)
    html = result['result']
    if isinstance(html, type(u'')):
        html = html.encode('utf-8')
    pt = PersistentTemporaryFile('.html')
    pt.write(html)
    pt.close()
    return pt.name

Error trace

Code:

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/utils/threadpool.py", line 95, in run
    (request, request.callable(*request.args, **request.kwds))
  File "/usr/lib/calibre/calibre/web/feeds/news.py", line 1063, in fetch_obfuscated_article
    path = os.path.abspath(self.get_obfuscated_article(url))
  File "<string>", line 197, in get_obfuscated_article
  File "/usr/lib/calibre/calibre/utils/ipc/simple_worker.py", line 160, in fork_job
    abort=abort)
  File "/usr/lib/calibre/calibre/utils/ipc/simple_worker.py", line 79, in communicate
    raise WorkerError('Failed to communicate with worker process')
WorkerError: Failed to communicate with worker process

Thanks

kovidgoyal · 04-21-2013, 03:31 PM

You cannot pass the browser via fork_job, only simple data structures like lists/tuples/dictionaries/strings.
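One way to work within that restriction, sketched here for Python 3 with the standard library only, is to flatten the cookies into plain dictionaries before the call and rebuild a cookie jar on the other side. The helper names `cookies_to_plain` and `plain_to_cookies` are hypothetical, and whether calibre's JavaScript browser can accept cookies rebuilt this way is not confirmed anywhere in this thread.

```python
from http.cookiejar import Cookie, CookieJar


def cookies_to_plain(jar):
    """Flatten a cookie jar into a list of dicts holding only simple values."""
    return [
        {
            "name": c.name,
            "value": c.value,
            "domain": c.domain,
            "path": c.path,
            "secure": c.secure,
            "expires": c.expires,
        }
        for c in jar
    ]


def plain_to_cookies(items):
    """Rebuild a CookieJar from the list produced by cookies_to_plain."""
    jar = CookieJar()
    for item in items:
        jar.set_cookie(Cookie(
            version=0,
            name=item["name"],
            value=item["value"],
            port=None,
            port_specified=False,
            domain=item["domain"],
            domain_specified=True,
            domain_initial_dot=item["domain"].startswith("."),
            path=item["path"],
            path_specified=True,
            secure=item["secure"],
            expires=item["expires"],
            discard=False,
            comment=None,
            comment_url=None,
            rest={},
        ))
    return jar
```

The list returned by `cookies_to_plain` contains only strings, booleans, numbers and `None`, so it is the kind of value that could go in `args` alongside the URL, provided `grab` is extended to take a second parameter.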
