Hi,
I've been using Calibre for a few years and have also used the recipe function to download news every day. Most of the recipes I use are slightly modified and based on RSS feeds.
I'm currently stumped trying to add a few articles from an HTML page to a previously populated feed. I have looked at the example
here of how to add articles, and I have also looked at several existing recipes.
What I am doing is as follows:
- Use a standard RSS feed definition:
Code:
feeds = [ ('Long Reads', 'https://longreads.com/feed/'), ]
(There are in fact several other RSS feeds here; for clarity I am showing only the relevant one.)
- I have created a parse_feeds function that first runs the base parse_feeds function, then loops through all the feeds/articles to check for one particular page, which is updated weekly (the 5 best long reads).
- It then extracts the links on this page and tries to append them to the feeds list. The code is as follows:
Code:
def parse_feeds(self):
    feeds = super(LongReads, self).parse_feeds()
    for articles in feeds:
        section = articles.title
        for article in articles:
            if article.url and 'longreads.com' in article.url:
                raw = browser().open_novisit(article.url).read()
                soup = BeautifulSoup(raw)
                newArticles = []
                for item in soup.findAll('a', attrs={'target': '_blank'}):
                    if item.parent.name == 'h3':
                        newArt = {}
                        newArt['title'] = item.string
                        newArt['url'] = item['href']
                        newArticles.append(newArt)
                feeds.append((section, newArticles))
    return feeds
- An example of the page being downloaded can be seen here: https://longreads.com/2023/04/21/the...-the-week-462/
The links are extracted correctly; the issue is that I always get the error
'tuple' object has no attribute 'title'. The example I based it on is obviously old, but I also see several newer recipes where appending to the feed list works.
Outputting the feeds array shows this (excerpt), so obviously the links are added incorrectly:
Code:
____________________
Title : SolarWinds: The Untold Story of the Boldest Supply-Chain Hack
URL : https://www.wired.com/story/the-untold-story-of-solarwinds-the-boldest-supply-chain-hack-ever/
Author : Kim Zetter
Summary : The attackers were i...
Date : Tue, 02 May, 2023 12:00
TOC thumb : None
Has content : False
, ('section', [{'title': '1. A Trucker’s Kidnapping, a Suspicious Ransom, and a Colorado Family’s Perilous Quest for Justice', 'url': 'https://www.5280.com/a-truckers-kidnapping-a-suspicious-ransom-and-a-colorado-familys-perilous-quest-for-justice/?src=longreads'},
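The excerpt above shows feed objects and a plain (section, articles) tuple sitting in the same list. That mix alone is enough to produce the error message, and it can be reproduced without Calibre. The sketch below is only an illustration: FakeFeed, FakeArticle and section_titles are hypothetical stand-ins, not Calibre's own classes or functions, and the URLs are placeholders.

```python
class FakeArticle:
    """Hypothetical stand-in for an article object; not Calibre's class."""
    def __init__(self, title, url):
        self.title = title
        self.url = url


class FakeFeed:
    """Hypothetical stand-in for a parsed feed: a title plus a list of articles."""
    def __init__(self, title, articles):
        self.title = title
        self.articles = list(articles)

    def __iter__(self):
        return iter(self.articles)


def section_titles(feeds):
    # Any code that expects feed objects will read attributes such as .title.
    return [feed.title for feed in feeds]


feeds = [FakeFeed('Long Reads', [FakeArticle('Top 5', 'https://example.com/top5')])]

# Appending a (section, articles) tuple mixes two different types in one list.
mixed = feeds + [('Long Reads', [{'title': 'Extra', 'url': 'https://example.com/extra'}])]

try:
    section_titles(mixed)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what happens in the recipe, the error may be raised either by the recipe's own loop (which reads articles.title for every entry, including a tuple appended earlier) or by whatever consumes the list that parse_feeds returns.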
I have also tried to create an array of Feed objects; when I use the append function it then complains that
'Feed' object has no attribute 'articles'.
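One possible way around both errors, sketched under assumptions: keep every entry in the list the same type, and add the extracted links to the article list of the section they were found in, after the inner loop has finished. StubFeed, StubArticle, extract_links and merge_extracted are hypothetical names used for illustration only. The sketch assumes that a populated feed object exposes a mutable articles list; in a real recipe the new entries would presumably have to be built with whatever article class Calibre itself uses.

```python
class StubArticle:
    """Hypothetical stand-in for an article object; not Calibre's class."""
    def __init__(self, title, url):
        self.title = title
        self.url = url


class StubFeed:
    """Hypothetical stand-in for an already populated feed."""
    def __init__(self, title, articles):
        self.title = title
        self.articles = list(articles)

    def __iter__(self):
        return iter(self.articles)


def extract_links(url):
    """Hypothetical placeholder for the download-and-parse step."""
    return [{'title': 'Extra read', 'url': 'https://example.com/extra'}]


def merge_extracted(feeds, wanted='longreads.com'):
    for feed in feeds:
        extra = []
        for article in feed:
            if article.url and wanted in article.url:
                for link in extract_links(article.url):
                    extra.append(StubArticle(link['title'], link['url']))
        # Extend only after the inner loop, so the list is never
        # modified while it is being iterated.
        feed.articles.extend(extra)
    return feeds


feeds = merge_extracted(
    [StubFeed('Long Reads', [StubArticle('Top 5', 'https://longreads.com/top5')])]
)
titles = [a.title for a in feeds[0]]
print(titles)
```

The point of this layout is that the outer list is never appended to during iteration and never receives a tuple, so every entry still has a title and an articles list when it is handed back.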
Grateful for any help with this, there is obviously something simple that I cannot see...