MobileRead Forums - View Single Post

unkn0wn · 05-01-2022, 03:43 AM

Financial Times has RSS feeds.

And the JSON is also very similar to the one above.

Code:

import json, re
from calibre.web.feeds.news import BasicNewsRecipe

class ft(BasicNewsRecipe):
    title = 'Financial Times'
    language = 'en'
    __author__ = "Kovid Goyal"
    description = 'The Financial Times is one of the world’s leading news organisations, recognised internationally for its authority, integrity and accuracy.'
    oldest_article = 1.5
    max_articles_per_feed = 50
    no_stylesheets = True
    remove_javascript = True
    ignore_duplicate_articles = {'url'}
    remove_attributes = ['style', 'width', 'height']
    
    def get_cover_url(self):
        # Scrape today's front page image; keep the default cover_url if it cannot be found
        soup = self.index_to_soup('https://www.todayspapers.co.uk/the-financial-times-front-page-today/')
        tag = soup.find('div', attrs={'class': 'elementor-image'})
        img = tag.find('img') if tag else None
        if img and img.get('src'):
            self.cover_url = img['src']
        return self.cover_url
    
    feeds = [
        ('World', 'https://www.ft.com/world?format=rss'),
        ('US', 'https://www.ft.com/world/us?format=rss'),
        ('Companies', 'https://www.ft.com/companies?format=rss'),
        ('Tech', 'https://www.ft.com/technology?format=rss'),
        ('Markets', 'https://www.ft.com/markets?format=rss'),
        ('Climate', 'https://www.ft.com/climate-capital?format=rss'),
        ('Opinion', 'https://www.ft.com/opinion?format=rss'),
        ('Life & Arts', 'https://www.ft.com/life-arts?format=rss'),
        ('How To Spend It', 'https://www.ft.com/htsi?format=rss'),
        ]
        
calibre_most_common_ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36'

I tried pre-processing the JSON, but I keep getting errors.
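
For reference, a minimal sketch of what defensive JSON pre-processing could look like. It assumes the article page embeds its data in a `<script type="application/ld+json">` block with schema.org `headline` and `articleBody` fields; that layout is an assumption and is not confirmed anywhere in this thread. The helper names `extract_ld_json` and `article_html_from_ld` are made up for illustration and are not part of calibre.

```python
import json
import re
from html import escape

# Matches the contents of any <script type="application/ld+json"> block.
LD_JSON_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_ld_json(raw_html):
    """Return the first JSON-LD object found in raw_html, or None."""
    for match in LD_JSON_RE.finditer(raw_html):
        try:
            data = json.loads(match.group(1))
        except ValueError:
            # Malformed block: skip it rather than abort the download.
            continue
        if isinstance(data, dict):
            return data
    return None


def article_html_from_ld(data):
    """Build a bare HTML page from the (assumed) headline/articleBody fields."""
    title = data.get('headline', '')
    body = data.get('articleBody', '')
    # One <p> per non-empty line of the body text.
    paras = ''.join(
        '<p>{}</p>'.format(escape(p)) for p in body.split('\n') if p.strip()
    )
    return '<html><body><h1>{}</h1>{}</body></html>'.format(escape(title), paras)
```

In a recipe, something like `preprocess_raw_html(self, raw, url)` could call `extract_ld_json(raw)` and return `raw` unchanged when the result is `None`, so a page without usable JSON falls back to the normal HTML instead of raising.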