MobileRead Forums - View Single Post - Google Reader Recipe hack - Download all unread instead of just starred

Starson17 · 07-10-2010, 01:36 PM

Now I need some help from someone who knows something about Google Reader.

Step 1 was to retrieve the auth code. I needed to change the request a bit from the old recipe, but that was successful.

Step 2 was to use the auth code and add it as a header to each request. That was a pain. I got the header in OK, but it wasn't authorizing. For one thing, the API specs weren't clear about the exact format; for another, the mechanize docs weren't very clear either; and for a third, unless you ask the right way, the auth code you get in step 1 is wrong. Usually I can watch the headers when I retrieve from a website and then just do the same thing, but this is a program interface, not a browser interface, and I don't have anything working in the browser that I can copy.

Nonetheless, I was able to get it authenticating by changing the original request and copying the header format I found in a blog.
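For anyone following along, here is a minimal sketch of the header step on its own, separated from calibre and mechanize so it can be tried without a network connection. It assumes the ClientLogin response body is a set of key=value lines (SID, LSID, Auth); the function names and the sample token values are made up for illustration.

```python
def parse_clientlogin(body):
    """Split a ClientLogin-style response body into a dict of key -> value."""
    fields = {}
    for line in body.splitlines():
        # Split on the first '=' only, since a token may itself contain '='.
        key, sep, value = line.partition('=')
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def auth_header(body):
    """Build the Authorization header from the Auth field (not SID or LSID)."""
    fields = parse_clientlogin(body)
    if 'Auth' not in fields:
        raise ValueError('no Auth field in ClientLogin response')
    return {'Authorization': 'GoogleLogin auth=' + fields['Auth']}


# Hypothetical response body; real tokens are much longer.
sample = "SID=aaa111\nLSID=bbb222\nAuth=ccc333\n"
print(auth_header(sample))
```

The point of going through a dict is that the header is built from the Auth value specifically, rather than from whichever token happens to come first in the response.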

At that point, I expected success, and I suppose I may have it, but there's no content.

I can retrieve an authenticated page at:

http://www.google.com/reader/api/0/tag/list

That's where the recipe gets the feeds.

The recipe then builds URLs for a feed list. Here are some of the feeds that it built:

(u'National Review Blogs', 'http://www.google.com/reader/atom/user/18271557450980043868/label/National%20Review%20Blogs?n=50&xt=user/-/state/com.google/read'),
(u'North Carolina', 'http://www.google.com/reader/atom/user/18271557450980043868/label/North%20Carolina?n=50&xt=user/-/state/com.google/read'),
(u'Watts Up With That', 'http://www.google.com/reader/atom/user/18271557450980043868/label/Watts%20Up%20With%20That?n=50&xt=user/-/state/com.google/read'),
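The pattern in these URLs can be reproduced in a few lines, which makes it easier to experiment with other parameters. This is only a sketch inferred from the three URLs above, not taken from the recipe source, and the function and constant names are hypothetical. If the unofficial API write-ups are right, n is the number of items requested and xt excludes items in the given state (here, already read), so a label whose items have all been read would come back empty.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2, which calibre recipes of this era use

READER_ATOM = 'http://www.google.com/reader/atom/'
READ_STATE = 'user/-/state/com.google/read'


def label_feed_url(user_id, label, count=50):
    """Rebuild a label feed URL in the same shape as the ones listed above."""
    # quote() percent-encodes the spaces in the label name as %20.
    path = 'user/%s/label/%s' % (user_id, quote(label))
    return '%s%s?n=%d&xt=%s' % (READER_ATOM, path, count, READ_STATE)


print(label_feed_url('18271557450980043868', 'North Carolina'))
```

Dropping the xt part of the query string in a test run would be one quick way to check whether the feeds are empty only because everything in the test account is marked read.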

However, these don't seem to have any content. I have no idea why the recipe builds these particular URLs, and since I'm dealing with programmatic access to Google Reader, not browser access, I can't copy a successful browser session. Perhaps the recipe should build some other URLs for its feeds from the contents of the page at http://www.google.com/reader/api/0/tag/list, but if so, I don't know what it should build. Maybe my test account needs to have articles tagged or marked in some special way to show up at these locations?
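One way to tell "authenticated but empty" apart from "not a feed at all" is to count the entry elements in whatever comes back. A rough sketch, assuming the response is a standard Atom document; the helper name is hypothetical and the two sample feeds are invented stand-ins for a real response.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this namespace, so a bare 'entry' lookup finds nothing.
ATOM_NS = '{http://www.w3.org/2005/Atom}'


def count_entries(atom_xml):
    """Return how many <entry> elements sit directly under the <feed> root."""
    root = ET.fromstring(atom_xml)
    return len(root.findall(ATOM_NS + 'entry'))


empty_feed = ('<feed xmlns="http://www.w3.org/2005/Atom">'
              '<title>North Carolina</title></feed>')
one_item = ('<feed xmlns="http://www.w3.org/2005/Atom">'
            '<title>North Carolina</title>'
            '<entry><title>Example item</title></entry></feed>')

print(count_entries(empty_feed), count_entries(one_item))
```

If the response is not XML at all (a login page, say), ET.fromstring raises a ParseError, which is itself a useful signal that the request was not treated as an authenticated API call.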

Alternatively, maybe I need some special cookies for the content to show up?

It would probably be a heck of a lot easier for me to just rewrite the whole dang thing from scratch, basing it on Firefox browser access instead of this API.

If anyone knows the URL that the recipe should be building for each feed, let me know. If you want to test further, replace the get_browser in the recipe with this:

Code:

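    # Needs at the top of the recipe: import re, urllib, mechanize
    # and: from calibre import __appname__
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        # Print HTTP headers.
        br.set_debug_http(True)
        orig_open_novisit = br.open_novisit
        if self.username is not None and self.password is not None:
            # Ask ClientLogin for a token scoped to the 'reader' service.
            request = urllib.urlencode([('Email', self.username), ('Passwd', self.password),
                                        ('service', 'reader'), ('accountType', 'HOSTED_OR_GOOGLE'), ('source', __appname__)])
            response = br.open('https://www.google.com/accounts/ClientLogin', request)
            auth = re.search(r'Auth=(\S*)', response.read()).group(1)
            # Wrap open_novisit so every request carries the Authorization header.
            # kwargs are accepted but not forwarded.
            def my_open_no_visit(url, **kwargs):
                req = mechanize.Request(
                    url,
                    headers = {
                        'Authorization':'GoogleLogin auth='+auth,
                        })
                return orig_open_novisit(req)
            # Install the wrapper only when credentials were supplied;
            # otherwise my_open_no_visit is never defined.
            br.open_novisit = my_open_no_visit
        return br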

For all I know, it may start working, depending on how your account is set up. Alternatively, Google may have changed the way access is made, although the blog seemed to say it hasn't.
