MobileRead Forums > E-Book Software > Calibre > Recipes
Old 03-28-2017, 03:28 AM   #1
lennyw
Member
Posts: 13
Karma: 10
Join Date: Oct 2011
Location: Berlin, Germany
Device: Kindle 3G
The Spectator - how do I download back issues?

Hi all,

I've been looking at the recipe for The Spectator, hoping there'd be a line which I could edit to download a complete back issue, but cannot find it (I have read the threads on downloading back issues of The Economist that one just needs to edit the line INDEX = 'https://www.economist.com/printedition').

On The Spectator's website each issue's index has an issue-specific url, eg:
https://www.spectator.co.uk/issues/11-march-2017/

Is there any way I can incorporate this into the feed to download specific issues?

Thanks in advance for your help
Lenny

PS Here's the recipe code from Calibre:
Spoiler:
Code:
from calibre.web.feeds.recipes import BasicNewsRecipe


def class_sel(cls):
    def f(x):
        return x and cls in x.split()
    return f


class Spectator(BasicNewsRecipe):

    title = 'Spectator Magazine'
    __author__ = 'Kovid Goyal'
    description = 'Magazine'
    language = 'en'

    no_stylesheets = True

    keep_only_tags = dict(name='div', attrs={
                          'class': ['article-header__text', 'featured-image', 'article-content']})
    remove_tags = [
        dict(name='div', attrs={'id': ['disqus_thread']}),
        dict(attrs={'class': ['middle-promo',
                              'sharing', 'mejs-player-holder']}),
        dict(name='a', onclick=lambda x: x and '__gaTracker' in x and 'outbound-article' in x),
    ]
    remove_tags_after = [
        dict(name='hr', attrs={'class': 'sticky-clear'}),
    ]

    def parse_spec_section(self, div):
        h2 = div.find('h2')
        sectitle = self.tag_to_string(h2)
        self.log('Section:', sectitle)
        articles = []
        for div in div.findAll('div', id=lambda x: x and x.startswith('post-')):
            h2 = div.find('h2', attrs={'class': class_sel('term-item__title')})
            if h2 is None:
                h2 = div.find(attrs={'class': class_sel('news-listing__title')})
            title = self.tag_to_string(h2)
            a = h2.find('a')
            url = a['href']
            desc = ''
            self.log('\tArticle:', title)
            p = div.find(attrs={'class': class_sel('term-item__excerpt')})
            if p is not None:
                desc = self.tag_to_string(p)
            articles.append({'title': title, 'url': url, 'description': desc})
        return sectitle, articles

    def parse_index(self):
        soup = self.index_to_soup('https://www.spectator.co.uk/magazine/')
        a = soup.find('a', attrs={'class': 'issue-details__cover-link'})
        self.timefmt = ' [%s]' % a['title']
        self.cover_url = a['href']
        if self.cover_url.startswith('//'):
            self.cover_url = 'http:' + self.cover_url

        feeds = []

        div = soup.find(attrs={'class': class_sel('content-area')})
        for x in div.findAll(attrs={'class': class_sel('magazine-section-holder')}):
            title, articles = self.parse_spec_section(x)
            if articles:
                feeds.append((title, articles))
        return feeds
Old 03-28-2017, 03:48 AM   #2
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Look at line 52 in the recipe; that is where it gets the contents from.
Old 03-28-2017, 04:12 PM   #3
lennyw
Terrific, thanks for that.

Just to clarify for future readers of this thread:
1. Click on the small arrow next to "Fetch news"
2. Click on "Add a custom news source"
3. Click on the button at the bottom, "Customise builtin recipe"
4. Select the journal (in this case "Spectator Magazine")
5. This will open up the code. Replace line 52
Code:
soup = self.index_to_soup('https://www.spectator.co.uk/magazine/')
with, for example
Code:
soup = self.index_to_soup('https://www.spectator.co.uk/issues/25-march-2017/')
Also, to avoid confusion, change line 12 from
Code:
    title = 'Spectator Magazine'
to
Code:
    title = 'Spectator Magazine Backissue'
6. Click Save
7. Close the "Add custom news source" window
8. Click on the small arrow next to "Fetch news"
9. Click on "Schedule news download"
10. Go to the "Custom" list and select "Spectator Magazine Backissue" (or whatever you called it)
11. Click "Download now"
12. Repeat as necessary.
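One more convenience: the issue URLs in step 5 follow a plain day-month-year pattern, so if you're downloading a run of back issues you can generate the URL for each date rather than typing it out. This is just a sketch based on the URL pattern seen above (`issue_index_url` is a name I made up, and I'm assuming the site keeps this pattern for all issues):

```python
from datetime import date

def issue_index_url(d):
    # Build an issue-specific index URL in the pattern seen on
    # spectator.co.uk, e.g. date(2017, 3, 11) ->
    # 'https://www.spectator.co.uk/issues/11-march-2017/'
    # Note: strftime('%B') is locale-dependent; this assumes an
    # English locale so the month name comes out as e.g. 'march'.
    return 'https://www.spectator.co.uk/issues/%d-%s-%d/' % (
        d.day, d.strftime('%B').lower(), d.year)

print(issue_index_url(date(2017, 3, 25)))
# prints https://www.spectator.co.uk/issues/25-march-2017/
```

Paste the resulting URL into line 52 as in step 5 above.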
MobileRead.com is a privately owned, operated and funded community.