The same issue arises with JavaScript-based retrieval, although there the response seen is "forbidden." The result is blank in that case too, so I suspect Python is simply not reporting the forbidden status.
The weird thing is, if the URL is constructed rather than extracted from the JSON structure, the image is retrieved successfully. I modified the Economist recipe as follows.
Code:
self.cover_url = (
    safe_dict(data, 'props', 'pageProps', 'content', 'cover', 'url')
    .replace(
        'economist.com/',
        'economist.com/cdn-cgi/image/width=960,quality=80,format=auto/',
    )
    .replace('SQ_', '')
)
self.log('Got embedded cover:', self.cover_url)
#from datetime import datetime
#issueDate = datetime.fromisoformat(safe_dict(data, 'props', 'pageProps', 'content', 'issueDate').replace('Z', '+00:00')).strftime("%Y%m%d")
#self.cover_url = 'https://www.economist.com/cdn-cgi/image/width=960,quality=80,format=auto/content-assets/images/' + issueDate + '_DE_US.jpg'
#self.log('Got constructed cover:', self.cover_url)
As expected, the cover image does not load. However, if I uncomment the code that builds the constructed URL, it works, even though the two URLs appear to be the same. Here is the log output:
Code:
Got embedded cover: https://www.economist.com/cdn-cgi/image/width=960,quality=80,format=auto/content-assets/images/29250920_DE_US.jpg
Got constructed cover: https://www.economist.com/cdn-cgi/image/width=960,quality=80,format=auto/content-assets/images/20250920_DE_US.jpg
I get the same results using JavaScript. The string lengths are the same (so there are no hidden characters corrupting the embedded URL).
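One caveat on the length check: equal lengths rule out extra characters, but not a substituted one. A character-by-character comparison is a stricter test. The sketch below is only an illustration: `first_difference` is a hypothetical helper, not part of calibre or the recipe, and the two strings are copied from the log lines as quoted above. As quoted, they differ in one digit of the date (`2925...` versus `2025...`); if the log was retyped by hand this may be a transcription slip rather than the real cause.

```python
# Hypothetical helper, not part of calibre: find the first position
# where two strings differ.
def first_difference(a, b):
    """Return (index, char_a, char_b) for the first mismatch, or None."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i, ca, cb
    if len(a) != len(b):
        # One string is a strict prefix of the other.
        i = min(len(a), len(b))
        return i, a[i:i + 1], b[i:i + 1]
    return None


# Values copied from the log output quoted in this post.
PREFIX = ('https://www.economist.com/cdn-cgi/image/'
          'width=960,quality=80,format=auto/content-assets/images/')
embedded = PREFIX + '29250920_DE_US.jpg'
constructed = PREFIX + '20250920_DE_US.jpg'

print('same length:', len(embedded) == len(constructed))
print('first difference:', first_difference(embedded, constructed))
# Escaping non-ASCII characters exposes look-alike or invisible ones.
print(embedded.encode('unicode_escape'))
```

Inside the recipe, the same check could be run on the two values passed to `self.log` before either is requested, which would show whether the embedded URL from the JSON really matches the constructed one.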
Very strange indeed. If anyone has a theory as to what's happening, let's hear it.