[Recipe Request] The Baffler

duluoz · 05-23-2019, 04:36 AM

Possible to create a recipe for The Baffler?

All issues can be found at the link below; the latest is always the first on the left. I had a go but could not figure out how to download the latest issue, because of the URL naming scheme.

https://thebaffler.com/issues

Thanks!

Jamie

lui1 · 05-23-2019, 09:37 PM

Hello Jamie,

This recipe should do the trick. Let me know if it works.

Jose

New Recipe for The Baffler:

Code:

from calibre.web.feeds.recipes import BasicNewsRecipe
import re

def classes(classes):
    q = frozenset(classes.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)}
    )

class TheBaffler(BasicNewsRecipe):

    title = 'The Baffler'
    __author__ = 'Jose Ortiz'
    description = ('This magazine contains left-wing criticism, cultural analysis, short'
                   ' stories, poems and art. They publish six print issues annually.')
    language = 'en_US'
    encoding = 'UTF-8'
    no_javascript = True
    no_stylesheets = True

    keep_only_tags = [
        classes('header-contain entry-content')
    ]

    def parse_index(self):
        soup = self.index_to_soup('https://thebaffler.com/issues').main.article
        self.timefmt = ' [%s]' % self.tag_to_string(soup.find(**classes('date'))).strip()
        try:
            self.cover_url = re.sub(
                r'.*?url\((.*?)\).*', r'\1',
                soup.find(**classes('image-fill'))['style']).strip()
            self.log('cover_url at ', self.cover_url)
        except Exception:
            self.log.error('Failed to download cover_url')

        soup = self.index_to_soup(soup.a['href'])

        # Extract comments from `.entry-content' and prepend to self.description
        self.description = (
            u'\n\n' + self.tag_to_string(soup.find(**classes('entry-content')))
            + u'\n\n' + self.description
        )

        ans = []

        # Articles at `.contents section .meta'
        for section in soup.find(**classes('contents'))('section'):
            current_section = self.tag_to_string(section.h2)
            self.log(current_section)
            articles = []
            for div in section(**classes('meta')):
                # Getting articles
                a = div.find(**classes('title')).a
                title = self.tag_to_string(a)
                url = a['href']
                self.log('\t', title, ' at ', url)
                desc = ''
                r = div.find(**classes('deck'))
                if r is not None:
                    desc = self.tag_to_string(r)
                articles.append(
                    {'title': title, 'url': url, 'description': desc})
            if current_section and articles:
                ans.append((current_section, articles))

        return ans
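
For anyone adapting this recipe, here is a minimal sketch of what the `classes()` helper does, runnable without calibre. It returns keyword arguments for `find()`, and the value under `attrs['class']` is a predicate that is truthy when the element's class attribute shares at least one name with the requested set. The sample class strings below are made up for illustration and are not taken from the site.

```python
def classes(classes):
    # Same helper as in the recipe: build a matcher for any of the given class names.
    q = frozenset(classes.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)}
    )

# Pull the predicate out of the returned keyword arguments.
matches = classes('header-contain entry-content')['attrs']['class']

# Hypothetical class attribute values, for illustration only.
for value in ('entry-content wide', 'sidebar', None):
    print(repr(value), '->', bool(matches(value)))
```

The `x and ...` guard is what keeps elements with no class attribute from raising an error: the predicate simply returns a falsy value for them.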
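
The cover lookup relies on a regular expression that pulls the address out of a CSS `url(...)` value in the element's inline style. The sketch below runs the same expression on a hypothetical style string (the `example.com` address and the surrounding CSS are assumptions, not what the site actually serves), so the behaviour can be checked outside calibre.

```python
import re

# Hypothetical inline style, modelled on what the recipe expects to find on
# the `.image-fill` element; the real attribute on the site may differ.
style = ("background-image: url(https://example.com/covers/issue.jpg); "
         "background-size: cover")

# Same pattern and replacement as in parse_index().
cover_url = re.sub(r'.*?url\((.*?)\).*', r'\1', style).strip()
print(cover_url)

# With no url(...) present, re.sub has nothing to replace and returns
# the input string unchanged rather than raising an error.
no_url = re.sub(r'.*?url\((.*?)\).*', r'\1', 'display: none').strip()
print(no_url)
```

Two things to keep in mind if the site changes: a quoted value such as `url('...')` would keep its quotes in the result, and a style without `url(...)` would not trigger the `except` branch, so `cover_url` would end up holding the raw style text.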

duluoz · 05-23-2019, 11:27 PM

Jose - many thanks for this. Works perfectly - very clever!
Thanks
Jamie
