02-02-2023, 11:21 PM   #10
unkn0wn
Guru
unkn0wn understands the Henderson-Hasselbalch Equation.
Posts: 625
Karma: 85520
Join Date: May 2021
Device: kindle
Quote:
Originally Posted by kovidgoyal
Yeah, it's interesting. I wonder where it got those RSS feeds from. At least one of them is wrong: the features one is a 404.

But here are some additions inspired by ChatGPT https://github.com/kovidgoyal/calibr...09b96a6c3eb89b
It is coming up with RSS feeds for most websites.
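
Since at least one of the suggested feeds turned out to be a 404, here is a rough sketch of how the feed URLs in a generated recipe could be checked before pasting it into calibre. The helper names `http_status` and `broken_feeds` are made up for this example (they are not part of calibre), and the User-Agent header is only an assumption, since some sites reject the default urllib one.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    # Assumption: a browser-like User-Agent avoids some blanket rejections.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a feed that does not exist
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, timeout, refused connection, ...


def broken_feeds(feeds, get_status=http_status):
    """Return (title, url, status) for every feed whose status is not 200.

    feeds has the same shape as the `feeds` list in a recipe:
    a list of (title, url) tuples.
    """
    results = []
    for title, url in feeds:
        status = get_status(url)
        if status != 200:
            results.append((title, url, status))
    return results
```

Passing the `feeds` list from any of the generated recipes to `broken_feeds` gives back the entries worth a second look. A 200 only means the URL resolves, not that it is actually an RSS feed.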

It came up with the same set of instructions (with those steps) for all the recipes I asked for.

Here are some of the recipes I asked it to create.
Spoiler:
write a calibre news recipe for aeon.com

Here's a Calibre news recipe for Aeon:

Title: Aeon - Ideas and Culture

Author: OpenAI

Description: A recipe for downloading the latest articles from Aeon for offline reading.

Step 1: Install Calibre

To use this recipe, you will need to have Calibre installed on your computer. Calibre is a free and open-source software that can be downloaded from the Calibre website.

Step 2: Add the recipe

In Calibre, click on the "Fetch news" button and then select "Configure news sources." In the "News Recipes" section, click the "Import" button and paste the following recipe:

python
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class Aeon(BasicNewsRecipe):
    title = 'Aeon - Ideas and Culture'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'c-article__title'}),
        dict(name='div', attrs={'class': 'c-article__body'}),
    ]
    feeds = [
        ('Latest Stories', 'https://aeon.co/feed'),
    ]
Step 3: Download the latest articles

In Calibre, click on the "Fetch news" button and select the "Aeon - Ideas and Culture" recipe. The latest articles from Aeon will be downloaded and saved to your Calibre library.

Step 4: Read the articles

You can now read the latest articles from Aeon offline using Calibre or another e-reader.

Note: This recipe was last tested on [insert date]. Aeon may make changes to their website that break this recipe, so please let me know if you encounter any issues.

Other code it generated:

I just asked for FT feeds, and it made a calibre recipe instead.

find feeds for financial times
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class FinancialTimes(BasicNewsRecipe):
    title = 'Financial Times'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    feeds = [
        ('Latest News', 'https://www.ft.com/rss/home/uk'),
        ('World News', 'https://www.ft.com/rss/world'),
        ('Companies & Markets', 'https://www.ft.com/rss/companies'),
        ('Opinion', 'https://www.ft.com/rss/comment'),
        ('Life & Arts', 'https://www.ft.com/rss/life-arts'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class CNBC(BasicNewsRecipe):
    title = 'CNBC'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'ArticleHeader-headline'}),
        dict(name='p', attrs={'class': 'ArticleBody-articleBody'}),
    ]

    feeds = [
        ('CNBC', 'https://www.cnbc.com/id/10000108/device/rss/rss.html'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class BostonGlobeMagazine(BasicNewsRecipe):
    title = 'Boston Globe Magazine'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='div', attrs={'class': 'col-lg-9'}),
        dict(name='div', attrs={'class': 'grid-post-headline-container'}),
        dict(name='h2'),
        dict(name='p'),
    ]
    remove_tags = [
        dict(name='div', attrs={'class': 'grid-post-image-container'}),
        dict(name='div', attrs={'class': 'grid-post-meta-container'}),
    ]

    feeds = [
        ('Boston Globe Magazine', 'https://www.bostonglobe.com/magazine/'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class IndiaToday(BasicNewsRecipe):
    title = 'India Today'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='div', attrs={'class': 'description'}),
        dict(name='div', attrs={'class': 'col-lg-9'}),
        dict(name='div', attrs={'class': 'img-center'}),
        dict(name='p'),
    ]
    remove_tags = [
        dict(name='div', attrs={'class': 'm-t-30'}),
        dict(name='ul', attrs={'class': 'more-story'}),
        dict(name='span', attrs={'class': 'time'}),
    ]

    feeds = [
        ('India Today', 'https://www.indiatoday.in/rss/1206578'),
    ]
write a calibre news recipe for boston globe print edition

(It forgot to put in the title.)
Code:
#!/usr/bin/env python

import datetime
import json
import re
import sys
import time
import urllib.request

from bs4 import BeautifulSoup

def download_pdf(url, filename):
    response = urllib.request.urlopen(url)
    with open(filename, "wb") as f:
        f.write(response.read())

def main():
    today = datetime.datetime.now().date()
    filename = f"Boston Globe {today}.pdf"

    url = f"https://www.bostonglobe.com/edition/{today}"
    soup = BeautifulSoup(urllib.request.urlopen(url), "html.parser")

    pdf_url = soup.find("a", string="Download PDF").get("href")
    download_pdf(pdf_url, filename)

    print("Done.")

if __name__ == "__main__":
    sys.exit(main())
All of this code was generated yesterday. I asked similar questions today and the answers are very different.
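
Some of the `feeds` entries in the recipes above, such as the Boston Globe Magazine one, appear to point at an ordinary section page rather than an RSS feed. As a minimal sketch (the function name `looks_like_feed` is just for illustration, and this only looks at the XML root element, which is a heuristic and not a full feed validation), the downloaded bytes can be sniffed like this:

```python
import xml.etree.ElementTree as ET

# Root element names used by RSS 2.0, Atom and RSS 1.0 (RDF) respectively.
FEED_ROOT_TAGS = {"rss", "feed", "RDF"}


def looks_like_feed(data):
    """Return True if data (bytes) parses as XML with an RSS/Atom/RDF root element."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Typical HTML pages are not well-formed XML and end up here.
        return False
    # Drop a namespace prefix such as "{http://www.w3.org/2005/Atom}feed".
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in FEED_ROOT_TAGS
```

The idea is to fetch each feed URL yourself and pass the response body to `looks_like_feed`. Anything that returns False is probably a normal web page or an error page, and the recipe would need a `parse_index` style approach or a real feed URL instead.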

write a calibre news recipe for the economist

Here's a sample Calibre recipe for The Economist, which you can use to automatically fetch and compile news articles from the publication's website into an ebook format.

yaml
Code:
Title: The Economist

Identifier: economist

Author: The Economist

Language: en

Formats: pdf,epub,mobi

Publish: yes

Tags: news, business, politics, economics

URL: https://www.economist.com/

Filters:
- type: content
  path: div[class="teaser__body"] h3 a
  url_prefix: https://www.economist.com
  attrs: {rel: "bookmark"}
  exclude:
    - .paywall
  format:
    title: .teaser__title
    content: div[class="teaser__body"]
  strip_elements:
    - .component
    - .teaser__title
    - .teaser__category
    - .teaser__author
    - .teaser__date
Note that this recipe is just a basic example and can be customized to your preferences by modifying the filters and strip elements. Calibre is a powerful tool and can be used to automatically fetch and compile news articles from a wide range of sources.

CNBC

write a calibre news recipe for cnbc

Here is a Calibre News Recipe for CNBC:

Title: CNBC News

Author: OpenAI

Description: Stay updated with the latest business news and financial analysis from CNBC.

Feed URL: https://www.cnbc.com/id/100003114/device/rss/rss.html

Categories: Business, Finance, News

Language: English

Update Interval: 12 hours

Action: All

Keep Unread: Yes

Remove Ads: Yes

Image Proxy: Yes

Custom Output Recipe:

json
Code:
{
    "title": "{title}",
    "author": "CNBC",
    "content": "{content}",
    "content_template": " CNB{content}",
    "cover_url": "{cover}",
    "series": "CNBC News"
}

It looks like what came out of one session yesterday is very different from what it produces today.
Attached Thumbnails: ft.png (67.2 KB, ID 199414), ft2.png (50.1 KB, ID 199415), tg.png (60.1 KB, ID 199416)

Last edited by unkn0wn; 02-02-2023 at 11:24 PM.