Old 09-18-2012, 07:16 PM   #1
rainrdx
Connoisseur
Posts: 55
Karma: 13316
Join Date: Jul 2012
Device: iPad
London Review of Books recipe updated

This is an update to the built-in recipe. I made some aesthetic changes.

Code:

__license__   = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
lrb.co.uk
'''
import re

from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe

class LondonReviewOfBooksPayed(BasicNewsRecipe):
    title                 = 'London Review of Books'
    __author__            = 'Rich Shang, Darko Miletic'
    description           = 'Subscription content. Literary review publishing essay-length book reviews and topical articles on politics, literature, history, philosophy, science and the arts by leading writers and thinkers'
    category              = 'news, literature, UK'
    publisher             = 'LRB Ltd.'
    max_articles_per_feed = 100
    language              = 'en_GB'
    no_stylesheets        = True
    delay                 = 1
    use_embedded_content  = False
    encoding              = 'utf-8'
    INDEX                 = 'http://www.lrb.co.uk'
    LOGIN                 = INDEX + '/login'
    masthead_url          = INDEX + '/assets/images/lrb_logo_big.gif'
    needs_subscription    = True
    publication_type      = 'magazine'
    extra_css             = ' body{font-family: Georgia,Palatino,"Palatino Linotype",serif} '


    def get_browser(self):
        # The base-class method must be called with the instance.
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open(self.LOGIN)
            br.select_form(nr=1)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def parse_index(self):
        articles = []
        soup = self.index_to_soup(self.INDEX)
        cover_item = soup.find('p',attrs={'class':'cover'})
        # The cover date is the text after the <br /> inside the 'coverdate' span.
        dates = str(soup.find('span', attrs={'class':'coverdate'}))
        newdates = re.sub(r'<.*>', '', re.split('<br />', dates)[1])
        self.timefmt = ' [%s]' % newdates
        lrbtitle = self.title
        if  cover_item:
            self.cover_url = re.sub('/m/','/l/',cover_item.a.img['src'])
            content = self.INDEX + cover_item.a['href']
            soup2 = self.index_to_soup(content)
            sitem = soup2.find(attrs={'class':'article-list'})
            lrbtitle = soup2.head.title.string
            for item in sitem.findAll('a',attrs={'class':'title'}):
                description = u''
                title_prefix = u''
                feed_link = item
                if feed_link.has_key('href'):
                    url   = self.INDEX + feed_link['href']
                    # Anchors split by <br /> have two parts; join them with ' - '.
                    title_link = re.split('<br />', str(feed_link))
                    if len(title_link) > 1:
                        title = title_prefix + re.sub(r'<.*>', '', title_link[0]) + ' - ' + re.sub(r'<.*>', '', title_link[1])
                    else:
                        title = title_prefix + self.tag_to_string(feed_link)
                    desc = item.findNext('li')
                    if desc is not None and desc.find('cite') is not None and desc.find('ul') is None:
                        description = self.tag_to_string(desc)
                    date  = strftime(self.timefmt)
                    articles.append({
                                      'title'      :title
                                     ,'date'       :date
                                     ,'url'        :url
                                     ,'description':description
                                    })
        return [(lrbtitle, articles)]

    conversion_options = {
                             'comments'  : description
                            ,'tags'      : category
                            ,'language'  : language
                            ,'publisher' : publisher
                         }

    keep_only_tags = [dict(name='div' , attrs={'class':['article-body indent','letters']})]
    remove_attributes = ['width','height']
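
For anyone adapting this, the string handling in parse_index can be tried outside calibre. Below is a minimal sketch, not part of the recipe itself. The sample HTML, the paths, and the helper names build_title, cover_date and large_cover are made up for illustration and are not taken from lrb.co.uk. The fallback branch uses a plain tag-stripping regex as a stand-in for calibre's tag_to_string.

```python
import re

# Greedy on purpose, as in the recipe: removes everything from the
# first '<' to the last '>' in the string it is applied to.
TAG_RE = re.compile(r'<.*>')


def build_title(anchor_html):
    """Join the two <br />-separated parts of an anchor with ' - '."""
    parts = re.split('<br />', anchor_html)
    if len(parts) > 1:
        return TAG_RE.sub('', parts[0]) + ' - ' + TAG_RE.sub('', parts[1])
    # Assumption: stand-in for self.tag_to_string(); strips each tag separately.
    return re.sub(r'<[^>]*>', '', anchor_html)


def cover_date(span_html):
    """Return the text after the <br /> inside the cover-date span."""
    return TAG_RE.sub('', re.split('<br />', span_html)[1])


def large_cover(src):
    """Swap the '/m/' path segment for '/l/' to request the larger cover."""
    return re.sub('/m/', '/l/', src)


if __name__ == '__main__':
    # Hypothetical markup, shaped the way the recipe expects it.
    print(build_title('<a class="title" href="/v34/n18/jane-doe/example">'
                      'Jane Doe<br />An Example Headline</a>'))
    print(cover_date('<span class="coverdate">Vol. 34 No. 18<br />'
                     '27 September 2012</span>'))
    print(large_cover('/assets/covers/m/cov3418.jpg'))
```

Note that cover_date, like the recipe, raises IndexError when the span has no <br /> in it, so a change in the site's markup would show up there first.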
Old 12-25-2012, 06:11 PM   #2
Spectrum
Zealot
Posts: 126
Karma: 570
Join Date: Nov 2008
Device: iPad 1 and iPad 4, KF HD 8.9"
Downloading a specific issue

Thanks!

Last edited by Spectrum; 12-27-2012 at 12:03 AM. Reason: Resolved