#1
Newsbeamer dev
Posts: 123
Karma: 1000
Join Date: Dec 2011
Device: Kindle Voyage
[Recipe Request] The Baffler
Is it possible to create a recipe for The Baffler?
All issues can be found at the link below. The latest is always the first one on the left. I had a go but could not figure out a way to download the latest issue, because the URL naming scheme gives no predictable address for it.
https://thebaffler.com/issues
Thanks!
Jamie
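
Since the newest issue is always listed first, one way around the URL naming scheme is to take the first issue link on the index page instead of trying to build the address. Below is a minimal sketch of that idea using only the standard library. The sample markup, the `example.com` URLs, and the names `FirstArticleLink` and `latest_issue_url` are assumptions made up for illustration; the real page structure may differ.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the issues index page; the real markup may differ.
SAMPLE_INDEX = """
<main>
  <article><a href="https://example.com/issues/no-99">Newest issue</a></article>
  <article><a href="https://example.com/issues/no-98">Older issue</a></article>
</main>
"""


class FirstArticleLink(HTMLParser):
    """Record the href of the first <a> found inside an <article>."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <article> tags currently open
        self.href = None  # first link seen inside an <article>

    def handle_starttag(self, tag, attrs):
        if tag == 'article':
            self.depth += 1
        elif tag == 'a' and self.depth and self.href is None:
            self.href = dict(attrs).get('href')

    def handle_endtag(self, tag):
        if tag == 'article' and self.depth:
            self.depth -= 1


def latest_issue_url(html):
    """Return the first article link, i.e. the newest issue in the sample."""
    parser = FirstArticleLink()
    parser.feed(html)
    return parser.href


print(latest_issue_url(SAMPLE_INDEX))
```

The point of the sketch is that the link is read from the page rather than guessed, so it keeps working whatever the next issue's URL looks like.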

#2
Enthusiast
Posts: 36
Karma: 10
Join Date: Dec 2017
Location: Los Angeles, CA
Device: Smart Phone
New Recipe for "The Baffler"
Hello Jamie,
This recipe should do the trick. Let me know if it works.
Jose

New Recipe for The Baffler:
Code:
import re

from calibre.web.feeds.recipes import BasicNewsRecipe


def classes(classes):
    # Build a BeautifulSoup filter that matches any tag carrying at least
    # one of the space-separated class names.
    q = frozenset(classes.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)}
    )


class TheBaffler(BasicNewsRecipe):
    title = 'The Baffler'
    __author__ = 'Jose Ortiz'
    description = ('This magazine contains left-wing criticism, cultural analysis, short'
                   ' stories, poems and art. They publish six print issues annually.')
    language = 'en_US'
    encoding = 'UTF-8'
    # calibre's option is named remove_javascript (the original post used
    # no_javascript, which calibre does not read)
    remove_javascript = True
    no_stylesheets = True
    keep_only_tags = [
        classes('header-contain entry-content')
    ]

    def parse_index(self):
        # The first <article> on the index page is the latest issue
        soup = self.index_to_soup('https://thebaffler.com/issues').main.article
        self.timefmt = ' [%s]' % self.tag_to_string(soup.find(**classes('date'))).strip()
        try:
            # The cover is set as a CSS background image in the style attribute
            self.cover_url = re.sub(
                r'.*?url\((.*?)\).*', r'\1',
                soup.find(**classes('image-fill'))['style']).strip()
            self.log('cover_url at ', self.cover_url)
        except Exception:
            self.log.error('Failed to download cover_url')
        # Follow the first link in that <article> to the issue page
        soup = self.index_to_soup(soup.a['href'])
        # Extract comments from `.entry-content' and prepend to self.description
        self.description = (
            u'\n\n' + self.tag_to_string(soup.find(**classes('entry-content')))
            + u'\n\n' + self.description
        )
        ans = []
        # Articles at `.contents section .meta'
        for section in soup.find(**classes('contents'))('section'):
            current_section = self.tag_to_string(section.h2)
            self.log(current_section)
            articles = []
            for div in section(**classes('meta')):
                # Getting articles
                a = div.find(**classes('title')).a
                title = self.tag_to_string(a)
                url = a['href']
                self.log('\t', title, ' at ', url)
                desc = ''
                r = div.find(**classes('deck'))
                if r is not None:
                    desc = self.tag_to_string(r)
                articles.append(
                    {'title': title, 'url': url, 'description': desc})
            if current_section and articles:
                ans.append((current_section, articles))
        return ans
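
The recipe leans on two small parsing tricks that can be tried outside calibre: the `classes` helper, which matches a tag if it shares any class name with the requested set, and the `re.sub` call that pulls the cover address out of an inline `style` attribute. The sketch below reproduces both in plain Python. The sample class strings, the `example.com` style value, and the wrapper name `cover_from_style` are assumptions for illustration, not taken from the site.

```python
import re


def classes(names):
    # Same helper as in the recipe: a filter on the "class" attribute that is
    # truthy when the tag has at least one of the requested class names.
    wanted = frozenset(names.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(wanted)}
    )


def cover_from_style(style):
    # Same substitution as in parse_index: keep only what is inside url(...)
    return re.sub(r'.*?url\((.*?)\).*', r'\1', style).strip()


# Pull out the predicate that BeautifulSoup would call with each class string
matches = classes('header-contain entry-content')['attrs']['class']

print(bool(matches('post entry-content')))  # shares "entry-content"
print(bool(matches('sidebar widget')))      # no class in common
print(bool(matches(None)))                  # tag without a class attribute

# Hypothetical inline style of the kind the recipe reads the cover from
style = 'background-image: url(https://example.com/cover.jpg);'
print(cover_from_style(style))
```

One thing the sketch makes visible: if the style value wraps the address in quotes, as in `url('...')`, the quotes stay in the result, so a recipe relying on this pattern may need to strip them.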

#3
Newsbeamer dev
Posts: 123
Karma: 1000
Join Date: Dec 2011
Device: Kindle Voyage
Jose - many thanks for this. Works perfectly - very clever!
Thanks,
Jamie