MobileRead Forums - View Single Post - Google Reader Recipe hack - Download all unread instead of just starred

rollercoaster · 01-09-2010, 12:59 PM

Hi everyone, I have been searching for a few hours and couldn't find a fix for the 'Google Reader shows only starred' problem.

Finally I decided to get my hands dirty and read the Google Reader API documentation. I am new to calibre and don't know Python at all, but (being a .NET dev) I was able to make a simple modification that helped somewhat. Here is the recipe without further ado:

Google Reader: This recipe fetches from your Google Reader account unread Starred items and unread Feeds you have placed in a folder via the manage subscriptions feature.

import urllib, re, mechanize
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre import __appname__

class GoogleReader(BasicNewsRecipe):
    title   = 'Google Reader'
    description = 'This recipe fetches from your Google Reader account unread Starred items and unread Feeds you have placed in a folder via the manage subscriptions feature.'
    needs_subscription = True
    __author__ = 'davec, rollercoaster, Starson17'
    base_url = 'http://www.google.com/reader/atom/'
    oldest_article = 365
    max_articles_per_feed = 50
    get_options = '?n=%d&xt=user/-/state/com.google/read' % max_articles_per_feed
    use_embedded_content = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            # Log in via ClientLogin and pull the Auth token out of the response body
            request = urllib.urlencode([('Email', self.username), ('Passwd', self.password),
                                        ('service', 'reader'), ('accountType', 'HOSTED_OR_GOOGLE'), ('source', __appname__)])
            response = br.open('https://www.google.com/accounts/ClientLogin', request)
            auth = re.search(r'Auth=(\S*)', response.read()).group(1)
            # Swap in an opener that sends the token with every later request
            cookies = mechanize.CookieJar()
            br = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
            br.addheaders = [('Authorization', 'GoogleLogin auth='+auth)]
        return br

    def get_feeds(self):
        feeds = []
        # Every tag id returned by tag/list becomes one feed
        soup = self.index_to_soup('http://www.google.com/reader/api/0/tag/list')
        for id in soup.findAll(True, attrs={'name':['id']}):
            url = id.contents[0]
            # Feed title is the last path segment of the tag id
            feeds.append((re.search(r'/([^/]*)$', url).group(1),
                          self.base_url + urllib.quote(url.encode('utf-8')) + self.get_options))
        return feeds
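
To make the get_feeds step easier to follow, here is a minimal standalone sketch of how one tag id turns into the (title, url) pair the recipe appends. The helper name build_feed and the sample tag id are made up for illustration, and the reading of the query string (n as the item cap, xt as "exclude items in this state") is my understanding of the unofficial Reader API rather than something stated in the post. The import fallback is only there so the sketch runs on both Python 2 (which the recipe targets) and Python 3.

```python
import re

try:
    from urllib import quote  # Python 2, as used by the recipe
except ImportError:
    from urllib.parse import quote  # Python 3

BASE_URL = 'http://www.google.com/reader/atom/'


def build_feed(tag_id, max_articles=50):
    """Return the (title, url) pair that get_feeds() would append for one tag id.

    build_feed is a hypothetical helper, not part of calibre.
    """
    # Same option string as the recipe's get_options attribute
    options = '?n=%d&xt=user/-/state/com.google/read' % max_articles
    # The title is whatever follows the last slash of the tag id
    title = re.search(r'/([^/]*)$', tag_id).group(1)
    # quote() leaves '/' alone by default and escapes spaces etc.
    return title, BASE_URL + quote(tag_id.encode('utf-8')) + options


# Hypothetical tag id, for illustration only
title, url = build_feed('user/12345/label/Tech News')
print(title)
print(url)
```

Raising max_articles only changes the n= value in the URL; it does not affect which tags are listed.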

Google Reader uber: Fetches all feeds from your Google Reader account including the uncategorized items.

import urllib, re, mechanize
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre import __appname__

class GoogleReaderUber(BasicNewsRecipe):
    title   = 'Google Reader uber'
    description = 'Fetches all feeds from your Google Reader account including the uncategorized items.'
    needs_subscription = True
    __author__ = 'davec, rollercoaster, Starson17'
    base_url = 'http://www.google.com/reader/atom/'
    oldest_article = 365
    max_articles_per_feed = 250
    get_options = '?n=%d&xt=user/-/state/com.google/read' % max_articles_per_feed
    use_embedded_content = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            # Log in via ClientLogin and pull the Auth token out of the response body
            request = urllib.urlencode([('Email', self.username), ('Passwd', self.password),
                                        ('service', 'reader'), ('accountType', 'HOSTED_OR_GOOGLE'), ('source', __appname__)])
            response = br.open('https://www.google.com/accounts/ClientLogin', request)
            auth = re.search(r'Auth=(\S*)', response.read()).group(1)
            # Swap in an opener that sends the token with every later request
            cookies = mechanize.CookieJar()
            br = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
            br.addheaders = [('Authorization', 'GoogleLogin auth='+auth)]
        return br

    def get_feeds(self):
        feeds = []
        # Every tag id returned by tag/list becomes one feed
        soup = self.index_to_soup('http://www.google.com/reader/api/0/tag/list')
        for id in soup.findAll(True, attrs={'name':['id']}):
            # Point the 'broadcast' entry at 'reading-list' instead
            url = id.contents[0].replace('broadcast','reading-list')
            # Feed title is the last path segment of the tag id
            feeds.append((re.search(r'/([^/]*)$', url).group(1),
                          self.base_url + urllib.quote(url.encode('utf-8')) + self.get_options))
        return feeds
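
The one-line change in the uber recipe, .replace('broadcast','reading-list'), can be tried out on its own. The sketch below applies it to a few tag ids; the ids are hypothetical examples of what tag/list might return, not captured output, and uber_tag is a name I made up for the illustration.

```python
def uber_tag(tag_id):
    """Apply the uber recipe's one-line change to a single tag id."""
    return tag_id.replace('broadcast', 'reading-list')


# Hypothetical tag ids, for illustration only
sample_ids = [
    'user/12345/state/com.google/starred',
    'user/12345/state/com.google/broadcast',
    'user/12345/label/Tech',
]

rewritten = [uber_tag(t) for t in sample_ids]
for before, after in zip(sample_ids, rewritten):
    print('%s -> %s' % (before, after))
```

Only the id containing 'broadcast' changes. Because this is a plain substring replace, a folder whose name happens to contain 'broadcast' would be rewritten too, so a stricter match may be worth considering.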

For a more detailed comparison see this thread. Thanks @dwanthny.

Notes:

The only difference between the two recipes is the max count and one line:

url = id.contents[0] --> url = id.contents[0].replace('broadcast','reading-list')

For now this seems to work by replacing the 'broadcast' tag in the listing with the wanted 'reading-list', which contains all the unread posts. Technically, 'http://www.google.com/reader/api/0/tag/list' does not return all the possible tag/state values. I am sure other recipe superheroes here can make this better.

There is one thing though. I set the max count to 250 but it only fetched about 57. I thought it might be a limit imposed by Google, as there were 121 unread items in my Google Reader account. (Update: fixed by adding an 'oldest_article' value.)

If I missed an already posted and better recipe for this, please point me to it so I can benefit from it as well, and don't hold back on your comments.

-----------------------------------------------------------------------
Update Recipe (13/07/10): The recipe has been updated to work with the changed authentication method in the Google Reader API. Thanks to Starson17.

@Kovid The name was changed from 'Google Reader Uber' to just 'Google Reader' to merge the two similar recipes. The Uber version can now be removed from the recipe set.

-----------------------------------------------------------------------
Update Recipe (14/07/10): Both recipes have been updated to work with the changed authentication method in the Google Reader API. Thanks to Starson17.

Last edited by rollercoaster; 07-15-2010 at 12:45 AM. Reason: The recipe has been updated
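
The updates above refer to the changed authentication method, which in both recipes comes down to finding the Auth token in the ClientLogin response and sending it back in an Authorization header. Below is a small sketch of just that parsing step, with no network access. The sample response body is invented (I am assuming the usual key=value-per-line layout), and extract_auth_token and auth_header are hypothetical helper names, not calibre or mechanize functions.

```python
import re


def extract_auth_token(body):
    """Pull the Auth value out of a ClientLogin-style response body."""
    match = re.search(r'Auth=(\S*)', body)
    if match is None:
        # The recipe itself would fail with AttributeError here; this is a clearer error
        raise ValueError('no Auth token found in response')
    return match.group(1)


def auth_header(token):
    """Build the header tuple the recipes place in br.addheaders."""
    return ('Authorization', 'GoogleLogin auth=' + token)


# Invented response body, assuming one key=value pair per line
sample_body = 'SID=aaa111\nLSID=bbb222\nAuth=ccc333\n'

token = extract_auth_token(sample_body)
print(auth_header(token))
```

The sketch works on text; on Python 3 a real response.read() returns bytes and would need decoding first.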