#1 |
Junior Member
Posts: 9
Karma: 10
Join Date: May 2023
Device: Onyx Boox Nova Air
|
Appending articles to a feed fails
Hi,
I've been using Calibre for a few years and use its recipe function to download news every day. Most of the recipes I use are slightly modified and based on RSS feeds. I'm currently stumped trying to add a few articles from an HTML page to a previously populated feed. I have looked at the example here of how to add articles, and I have also looked at several existing recipes. What I am doing is as follows:
The links are extracted correctly; the issue is that I always get the error 'tuple' object has no attribute 'title'. The example I based it on is obviously old, but I also see several newer recipes where using the append function on the feed works. Outputting the feeds array shows this (excerpt), so the links are clearly being added incorrectly:
Code:
____________________
Title : SolarWinds: The Untold Story of the Boldest Supply-Chain Hack
URL : https://www.wired.com/story/the-untold-story-of-solarwinds-the-boldest-supply-chain-hack-ever/
Author : Kim Zetter
Summary : The attackers were i...
Date : Tue, 02 May, 2023 12:00
TOC thumb : None
Has content : False
, ('section', [{'title': '1. A Trucker’s Kidnapping, a Suspicious Ransom, and a Colorado Family’s Perilous Quest for Justice', 'url': 'https://www.5280.com/a-truckers-kidnapping-a-suspicious-ransom-and-a-colorado-familys-perilous-quest-for-justice/?src=longreads'},

Grateful for any help with this; there is obviously something simple that I cannot see...
|
#2 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
feeds is a list of Feed objects. The form (title, list of articles) is used in parse_index(), not parse_feeds().
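To make the distinction concrete, here is a minimal standalone sketch that does not import calibre. `StubFeed`, `wrap_index` and `feed_titles` are hypothetical stand-ins invented for illustration, not calibre's real classes or helpers; the example URLs are placeholders too. It only tries to show why a tuple sitting in a list of feed objects produces the `'tuple' object has no attribute 'title'` error, and why wrapping the tuple first avoids it.

```python
class StubFeed:
    """Hypothetical stand-in for a feed object: just a title and a list of articles."""

    def __init__(self, title, articles):
        self.title = title
        self.articles = list(articles)


def wrap_index(index):
    """Turn parse_index()-style (title, articles) tuples into StubFeed objects."""
    return [StubFeed(title, articles) for title, articles in index]


def feed_titles(feeds):
    """Mimics downstream code that expects every entry to have a .title attribute."""
    return [feed.title for feed in feeds]


# A list shaped like the result of parse_feeds(): feed objects only.
feeds = [StubFeed('Wired', [{'title': 'Example A', 'url': 'https://example.com/a'}])]

# A parse_index()-style entry: (section title, list of article dicts).
extra = ('Long Reads', [{'title': 'Example B', 'url': 'https://example.com/b'}])

# Appending the tuple directly reproduces the kind of error from the first post.
error_message = None
try:
    feed_titles(feeds + [extra])
except AttributeError as err:
    error_message = str(err)

# Wrapping the tuple first keeps the list homogeneous.
fixed = feeds + wrap_index([extra])
titles = feed_titles(fixed)
```

In a real recipe the wrapping step would be done by calibre itself rather than by a hand-written helper like `wrap_index`.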
|
#3 | |
Junior Member
Posts: 9
Karma: 10
Join Date: May 2023
Device: Onyx Boox Nova Air
|
The code that works in the end looks like this, using the built-in feeds_from_index function to create Feed objects:
Code:
# imports this method relies on (placed at the top of the recipe file)
from datetime import date
from calibre import browser
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.web.feeds import feeds_from_index

# subclass parse_feeds and then add the links from the Long Reads HTML page to the feeds list
def parse_feeds(self):
    feeds = super(LongReads, self).parse_feeds()
    # Loop through existing articles until hit on the one from Long Reads website
    newArticles = []
    for curfeed in feeds:
        for a, curarticle in enumerate(curfeed.articles):
            # found the Long Reads page, extract links and summary using standard BS function
            if curarticle.url and 'longreads.com' in curarticle.url:
                raw = browser().open_novisit(curarticle.url).read()
                soup = BeautifulSoup(raw)
                for item in soup.findAll('a', attrs={'target': '_blank'}):
                    if item.parent.name == 'h3':
                        # found a link, create a new dictionary entry in basic article format and add to list
                        newArticles.append({
                            'title': item.string,
                            'date': date.today(),
                            'url': item['href'],
                            'description': item.parent.findNext('p').findNext('p').contents[0]
                        })
                # If there are any links, create/append a new Feed object
                if len(newArticles) > 0:
                    # use built in function to create feed objects from list of dictionaries with article info
                    newfeeds = feeds_from_index(
                        [('Long Reads', newArticles)],
                        oldest_article=self.oldest_article,
                        max_articles_per_feed=self.max_articles_per_feed)
                    # add the new feed objects to existing feed list, needs to be done one by one
                    for newfeed in newfeeds:
                        feeds.append(newfeed)
                    # finally delete original page as it is just a link page
                    feeds.pop(feeds.index(curfeed))
                    # return straight away, since feeds was modified while being iterated
                    return feeds
    # in case Long Reads page not downloaded we have this catch-all for returning feeds
    return feeds
|
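For anyone who wants to test the link-extraction step outside calibre, here is a rough sketch using only Python's standard `html.parser` instead of BeautifulSoup. The `H3LinkParser` class and the `SAMPLE` markup are made up for illustration and only assume the structure described above (an `<a target="_blank">` directly inside an `<h3>`); the real Long Reads page may differ. It collects only title and URL, leaving out the date and description.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not be tracked as open parents.
VOID_TAGS = {'br', 'img', 'hr', 'input', 'meta', 'link'}


class H3LinkParser(HTMLParser):
    """Collects <a target="_blank"> links whose direct parent is an <h3>."""

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open tags, innermost last
        self.current = None   # article dict being filled while inside a matching <a>
        self.articles = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        parent = self.stack[-1] if self.stack else None
        self.stack.append(tag)
        if tag == 'a' and parent == 'h3' and attrs.get('target') == '_blank':
            self.current = {'title': '', 'url': attrs.get('href', '')}

    def handle_data(self, data):
        # Only text inside a matching link contributes to the title.
        if self.current is not None:
            self.current['title'] += data

    def handle_endtag(self, tag):
        if tag == 'a' and self.current is not None:
            self.current['title'] = self.current['title'].strip()
            self.articles.append(self.current)
            self.current = None
        # Pop back to the matching open tag, tolerating unclosed inner tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


# Hypothetical markup, shaped like the structure the recipe code looks for.
SAMPLE = """
<h3>1. <a href="https://example.com/one" target="_blank">First story</a></h3>
<p>Author</p><p>Summary one</p>
<p><a href="https://example.com/skip" target="_blank">Not in a heading</a></p>
<h3><a href="https://example.com/two" target="_blank">Second story</a></h3>
"""

parser = H3LinkParser()
parser.feed(SAMPLE)
articles = parser.articles
```

The resulting list of dictionaries has the same basic shape as `newArticles` in the recipe, so the selection logic can be checked against saved HTML before running a full download.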