I like to read two features by John Crace, who writes regularly for the Guardian: 1. "The Politics Sketch" and 2. "Digested Week". My recipe fetches the first whenever it appears (two or three times a week), but never the second. Any ideas? Is there some code I could use to download anything in the issue by John Crace, for instance?
Here are the URLs:
1.
http://www.theguardian.com/politics/...cabinet-leaker
2.
https://www.theguardian.com/uk-news/...red-an-upgrade
(By the way, https://www.theguardian.com/uk-news/2019/apr/26/all displays the requisite "Digested Week" along with other articles, which DO download but are usually deleted as duplicates from e.g. Headlines.)
I've adapted the tail end of the standard recipe like this (the dates are taken from the system on the day of download):
Code:
def parse_index(self):
    # Front page first, then the dated listing pages.
    feeds = self.parse_section(self.base_url)
    # 'now' is assumed to be set earlier in the recipe; lower() gives e.g. 2019/apr/26/all
    day_path = now.strftime("%Y/%b/%d/all").lower()
    feeds += self.parse_section(
        'https://www.theguardian.com/politics/series/the-politics-sketch/' + day_path, 'Politics - ')
    feeds += self.parse_section(
        'https://www.theguardian.com/uk-news/' + day_path, 'UK News - ')
    # Saturday (5) or Sunday (6): add the Weekend section.
    if date.today().weekday() in (5, 6):
        feeds += self.parse_section('https://www.theguardian.com/theguardian/weekend', 'Weekend - ')
    return feeds
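
In case it helps anyone checking the dated URLs, here is a minimal standalone sketch of how they are put together, which can be run outside calibre. The helper name dated_section_url and the MONTHS tuple are made up for illustration and are not part of the standard recipe. It assumes the Guardian uses lowercase three-letter month names and zero-padded days, as in the 2019/apr/26 URL above, and it spells the months out so the result does not depend on the system locale.

```python
from datetime import date

# Hypothetical lookup table: lowercase month abbreviations, as seen in
# the 2019/apr/26 URL quoted in the post.
MONTHS = ('jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec')


def dated_section_url(base, day):
    """Build a '<base>/YYYY/mon/DD/all' listing URL for the given date.

    Illustrative helper only; the name is not from the calibre recipe.
    """
    return '{}/{}/{}/{:02d}/all'.format(
        base.rstrip('/'),        # avoid a double slash if base ends in '/'
        day.year,
        MONTHS[day.month - 1],   # month 1..12 -> 'jan'..'dec'
        day.day,                 # zero-padded, like strftime('%d')
    )


if __name__ == '__main__':
    print(dated_section_url('https://www.theguardian.com/uk-news/', date.today()))
```

Pasting the printed URL into a browser shows whether the listing page for that day actually contains the article you are after.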