MobileRead Forums > E-Book Software > Calibre > Recipes
Old 01-31-2023, 05:09 AM   #1
fengli
Connoisseur
fengli began at the beginning.
 
Posts: 80
Karma: 10
Join Date: Aug 2022
Device: PC
Help: Jerusalem Post recipe capture failure

Please take a look, thank you very much!

Failed feed: Arab Israeli Conflict
HTTP Error 403: Forbidden
Old 02-02-2023, 04:39 AM   #2
unkn0wn
Evangelist
unkn0wn can do the Funky Gibbon.
 
Posts: 448
Karma: 82686
Join Date: May 2021
Device: kindle
Code:
def get_browser(self):
    return BasicNewsRecipe.get_browser(self, user_agent='common_words/based')
Add this to the recipe.
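The 403 suggests the site may be rejecting the default User-Agent, which is presumably why overriding it helps. As a rough standalone illustration of the same idea outside calibre, here is a sketch using plain urllib; the User-Agent string and the helper name build_request are made up for this example and are not part of calibre.

```python
import urllib.request

# Hypothetical browser-like User-Agent string, chosen only for illustration.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/109.0 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a urllib Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Building the request does not touch the network.
req = build_request("https://www.jpost.com/Rss/RssFeedsHeadlines.aspx")
# urllib stores header names via str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would actually fetch the feed; whether the server then accepts it depends on the site.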
Old 02-02-2023, 04:46 AM   #3
unkn0wn
I asked ChatGPT to write the recipe.

Prompt: write a calibre news recipe for jerusalem post

Here's an example of a Calibre news recipe for The Jerusalem Post:

Title: The Jerusalem Post

Author: OpenAI

Description: A recipe for downloading the latest news from The Jerusalem Post for offline reading.

Step 1: Install Calibre

To use this recipe, you will need to have Calibre installed on your computer. Calibre is a free and open-source software that can be downloaded from the Calibre website.

Step 2: Add the recipe

In Calibre, click on the "Fetch news" button and then select "Configure news sources." In the "News Recipes" section, click the "Import" button and paste the following recipe:

Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class TheJerusalemPost(BasicNewsRecipe):
    title = 'The Jerusalem Post'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'headline'}),
        dict(name='div', attrs={'class': 'article-text'}),
    ]

    feeds = [
        ('Top Stories', 'https://www.jpost.com/Rss/RssFeedsHeadlines.aspx'),
        ('Israel News', 'https://www.jpost.com/Rss/RssFeedsIsrael.aspx'),
        ('International News', 'https://www.jpost.com/Rss/RssFeedsWorld.aspx'),
        ('Opinion', 'https://www.jpost.com/Rss/RssFeedsOpinion.aspx'),
        ('Features', 'https://www.jpost.com/Rss/RssFeedsFeatures.aspx'),
    ]
Step 3: Download the latest news

In Calibre, click on the "Fetch news" button and select the "The Jerusalem Post" recipe. The latest news from The Jerusalem Post will be downloaded and saved to your Calibre library.

Step 4: Read the news

You can now read the latest news from The Jerusalem Post offline using Calibre or another e-reader.

Note: This recipe was last tested on [insert date]. The Jerusalem Post may make changes to their website that break this recipe, so please let me know if you encounter any issues.

The code doesn't work; keep_only_tags seems to be wrong. It was able to find the RSS links, though.
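One way to test the suspicion about keep_only_tags is to count how many tags in a saved article page match each entry. The following is a minimal sketch using only the standard library. The helper names and the sample HTML are made up for illustration, and the matching is a simplification (class is treated as a space-separated list), not calibre's own logic.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count start tags matching specs shaped like keep_only_tags entries."""

    def __init__(self, specs):
        super().__init__()
        self.specs = specs
        self.counts = [0] * len(specs)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for i, spec in enumerate(self.specs):
            # A spec without a name matches any tag name.
            if spec.get("name") not in (None, tag):
                continue
            wanted = spec.get("attrs", {})
            if all(self._matches(k, attrs.get(k), v) for k, v in wanted.items()):
                self.counts[i] += 1

    @staticmethod
    def _matches(key, actual, wanted):
        if actual is None:
            return False
        if key == "class":
            # Simplification: match any one of the space-separated classes.
            return wanted in actual.split()
        return actual == wanted


def count_matches(html, specs):
    """Return one match count per spec, in the same order as specs."""
    parser = TagCounter(specs)
    parser.feed(html)
    return parser.counts


# Made-up page markup; a real check would use the HTML of an actual article.
sample = '<h1 class="article-title">Headline</h1><div class="article-body">Text</div>'
specs = [
    dict(name="h1", attrs={"class": "headline"}),
    dict(name="div", attrs={"class": "article-text"}),
]
print(count_matches(sample, specs))  # [0, 0]: neither entry matches this sample
```

If every count is zero for a real article page, the selectors do not describe that page's markup, which would fit the behaviour seen here.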
Old 02-02-2023, 06:30 AM   #4
fengli
Quote:
Originally Posted by unkn0wn View Post
Code:
def get_browser(self):
    return BasicNewsRecipe.get_browser(self, user_agent='common_words/based')
Add this to the recipe.

It is still failing: it captures only two headlines rather than all of them, and only the headlines, with no article content. Please take another look. Thank you very much.


from calibre.web.feeds.news import BasicNewsRecipe

class TheJerusalemPost(BasicNewsRecipe):
    title = 'The Jerusalem Post'
    oldest_article = 30
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'headline'}),
        dict(name='div', attrs={'class': 'article-text'}),
    ]

    def get_browser(self):
        return BasicNewsRecipe.get_browser(self, user_agent='common_words/based')

    feeds = [
        ('Top Stories', 'https://www.jpost.com/Rss/RssFeedsHeadlines.aspx'),
        ('Israel News', 'https://www.jpost.com/Rss/RssFeedsIsrael.aspx'),
        ('International News', 'https://www.jpost.com/Rss/RssFeedsWorld.aspx'),
        ('Opinion', 'https://www.jpost.com/Rss/RssFeedsOpinion.aspx'),
        ('Features', 'https://www.jpost.com/Rss/RssFeedsFeatures.aspx'),
    ]
Old 02-02-2023, 06:34 AM   #5
fengli
file:///F:/calibre%E9%A3%9F%E8%B0%B1...2%20192445.png

Last edited by fengli; 02-02-2023 at 06:37 AM.
Old 02-02-2023, 07:46 AM   #6
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 43,871
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That was a valiant try by ChatGPT, but sadly it looks like I can't retire just yet.
Old 02-02-2023, 08:09 AM   #7
unkn0wn
Quote:
Originally Posted by kovidgoyal View Post
That was a valiant try by ChatGPT, but sadly it looks like I can't retire just yet.
Yes. I think it just came up with the keep_only_tags based on code from available recipes, but it somehow searched for actual RSS feeds and put them in the code instead of making them up.

fengli, don't add that code to the recipe created by ChatGPT! Add it to the built-in recipe.
Old 02-02-2023, 09:37 AM   #8
kovidgoyal
Yeah, it's interesting. I wonder where it got those RSS feeds from. At least one of them is wrong: the Features one is a 404.

But here are some additions inspired by ChatGPT: https://github.com/kovidgoyal/calibr...09b96a6c3eb89b
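
A feed list like this can be checked for dead URLs before it goes into a recipe. The sketch below is one way to do that with the standard library; the function names are hypothetical, and the status lookup is injectable so the example runs offline with canned values. The 404 for Features is the one reported above, while the 200 for Opinion is only an assumption for the demonstration.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url (needs network access)."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions that carry the status code.
        return err.code


def broken_feeds(feeds, get_status=http_status):
    """Return (title, url, status) for every feed whose status is not 200."""
    results = []
    for title, url in feeds:
        status = get_status(url)
        if status != 200:
            results.append((title, url, status))
    return results


feeds = [
    ('Opinion', 'https://www.jpost.com/Rss/RssFeedsOpinion.aspx'),
    ('Features', 'https://www.jpost.com/Rss/RssFeedsFeatures.aspx'),
]
# Offline demonstration with canned statuses; omit get_status to query the site.
canned = {feeds[0][1]: 200, feeds[1][1]: 404}
print(broken_feeds(feeds, get_status=canned.get))
```

Connection failures other than HTTP errors (DNS, timeouts) are not handled here and would raise.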
Old 02-02-2023, 07:42 PM   #9
fengli
Quote:
Originally Posted by unkn0wn View Post
Yes. I think it just came up with the keep_only_tags based on code from available recipes, but it somehow searched for actual RSS feeds and put them in the code instead of making them up.

fengli, don't add that code to the recipe created by ChatGPT! Add it to the built-in recipe.
The capture was successful, thank you very much! 👍
Old 02-02-2023, 11:21 PM   #10
unkn0wn
Quote:
Originally Posted by kovidgoyal View Post
Yeah, it's interesting. I wonder where it got those RSS feeds from. At least one of them is wrong: the Features one is a 404.

But here are some additions inspired by ChatGPT: https://github.com/kovidgoyal/calibr...09b96a6c3eb89b
It is getting RSS feeds for most websites.

It came up with the same set of instructions (with those steps) for all the recipes I asked for.

Here are some of the recipes I asked it to create.
Spoiler:
write a calibre news recipe for aeon.com

Here's a Calibre news recipe for Aeon:

Title: Aeon - Ideas and Culture

Author: OpenAI

Description: A recipe for downloading the latest articles from Aeon for offline reading.

Step 1: Install Calibre

To use this recipe, you will need to have Calibre installed on your computer. Calibre is a free and open-source software that can be downloaded from the Calibre website.

Step 2: Add the recipe

In Calibre, click on the "Fetch news" button and then select "Configure news sources." In the "News Recipes" section, click the "Import" button and paste the following recipe:

Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class Aeon(BasicNewsRecipe):
    title = 'Aeon - Ideas and Culture'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'c-article__title'}),
        dict(name='div', attrs={'class': 'c-article__body'}),
    ]
    feeds = [
        ('Latest Stories', 'https://aeon.co/feed'),
    ]
Step 3: Download the latest articles

In Calibre, click on the "Fetch news" button and select the "Aeon - Ideas and Culture" recipe. The latest articles from Aeon will be downloaded and saved to your Calibre library.

Step 4: Read the articles

You can now read the latest articles from Aeon offline using Calibre or another e-reader.

Note: This recipe was last tested on [insert date]. Aeon may make changes to their website that break this recipe, so please let me know if you encounter any issues.

Other code it generated:

I just asked for FT feeds, and it produced a calibre recipe instead.

Prompt: find feeds for financial times
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class FinancialTimes(BasicNewsRecipe):
    title = 'Financial Times'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    feeds = [
        ('Latest News', 'https://www.ft.com/rss/home/uk'),
        ('World News', 'https://www.ft.com/rss/world'),
        ('Companies & Markets', 'https://www.ft.com/rss/companies'),
        ('Opinion', 'https://www.ft.com/rss/comment'),
        ('Life & Arts', 'https://www.ft.com/rss/life-arts'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class CNBC(BasicNewsRecipe):
    title = 'CNBC'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'ArticleHeader-headline'}),
        dict(name='p', attrs={'class': 'ArticleBody-articleBody'}),
    ]

    feeds = [
        ('CNBC', 'https://www.cnbc.com/id/10000108/device/rss/rss.html'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class BostonGlobeMagazine(BasicNewsRecipe):
    title = 'Boston Globe Magazine'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='div', attrs={'class': 'col-lg-9'}),
        dict(name='div', attrs={'class': 'grid-post-headline-container'}),
        dict(name='h2'),
        dict(name='p'),
    ]
    remove_tags = [
        dict(name='div', attrs={'class': 'grid-post-image-container'}),
        dict(name='div', attrs={'class': 'grid-post-meta-container'}),
    ]

    feeds = [
        ('Boston Globe Magazine', 'https://www.bostonglobe.com/magazine/'),
    ]
Code:
#!/usr/bin/env python

from calibre.web.feeds.news import BasicNewsRecipe

class IndiaToday(BasicNewsRecipe):
    title = 'India Today'
    oldest_article = 7
    use_embedded_content = False
    no_stylesheets = True
    keep_only_tags = [
        dict(name='div', attrs={'class': 'description'}),
        dict(name='div', attrs={'class': 'col-lg-9'}),
        dict(name='div', attrs={'class': 'img-center'}),
        dict(name='p'),
    ]
    remove_tags = [
        dict(name='div', attrs={'class': 'm-t-30'}),
        dict(name='ul', attrs={'class': 'more-story'}),
        dict(name='span', attrs={'class': 'time'}),
    ]

    feeds = [
        ('India Today', 'https://www.indiatoday.in/rss/1206578'),
    ]
Prompt: write a calibre news recipe for boston globe print edition (it forgot to put in the title)
Code:
#!/usr/bin/env python

import datetime
import json
import re
import sys
import time
import urllib.request

from bs4 import BeautifulSoup

def download_pdf(url, filename):
    response = urllib.request.urlopen(url)
    with open(filename, "wb") as f:
        f.write(response.read())

def main():
    today = datetime.datetime.now().date()
    filename = f"Boston Globe {today}.pdf"

    url = f"https://www.bostonglobe.com/edition/{today}"
    soup = BeautifulSoup(urllib.request.urlopen(url), "html.parser")

    pdf_url = soup.find("a", string="Download PDF").get("href")
    download_pdf(pdf_url, filename)

    print("Done.")

if __name__ == "__main__":
    sys.exit(main())
All of this code was generated yesterday. I asked similar questions today and the answers are very different.

write a calibre news recipe for the economist

Here's a sample Calibre recipe for The Economist, which you can use to automatically fetch and compile news articles from the publication's website into an ebook format.

Code:
Title: The Economist

Identifier: economist

Author: The Economist

Language: en

Formats: pdf,epub,mobi

Publish: yes

Tags: news, business, politics, economics

URL: https://www.economist.com/

Filters:
- type: content
  path: div[class="teaser__body"] h3 a
  url_prefix: https://www.economist.com
  attrs: {rel: "bookmark"}
  exclude:
    - .paywall
  format:
    title: .teaser__title
    content: div[class="teaser__body"]
  strip_elements:
    - .component
    - .teaser__title
    - .teaser__category
    - .teaser__author
    - .teaser__date
Note that this recipe is just a basic example and can be customized to your preferences by modifying the filters and strip elements. Calibre is a powerful tool and can be used to automatically fetch and compile news articles from a wide range of sources.

CNBC

write a calibre news recipe for cnbc

Here is a Calibre News Recipe for CNBC:

Title: CNBC News

Author: OpenAI

Description: Stay updated with the latest business news and financial analysis from CNBC.

Feed URL: https://www.cnbc.com/id/100003114/device/rss/rss.html

Categories: Business, Finance, News

Language: English

Update Interval: 12 hours

Action: All

Keep Unread: Yes

Remove Ads: Yes

Image Proxy: Yes

Custom Output Recipe:

Code:
{
    "title": "{title}",
    "author": "CNBC",
    "content": "{content}",
    "content_template": " CNB{content}",
    "cover_url": "{cover}",
    "series": "CNBC News"
}

Looks like what came out from one session yesterday is very different from today's.
Attached Thumbnails: ft.png, ft2.png, tg.png

Last edited by unkn0wn; 02-02-2023 at 11:24 PM.
Old 02-02-2023, 11:31 PM   #11
kovidgoyal
That's surprising to me. It means it is adjusting its model continuously. How does anyone use it reliably for anything? If the very same question elicits different answers after a few hours...
kovidgoyal is offline   Reply With Quote
Old 02-02-2023, 11:56 PM   #12
unkn0wn
Quote:
Originally Posted by kovidgoyal View Post
That's surprising to me. It means it is adjusting its model continuously. How does anyone use it reliably for anything? If the very same question elicits different answers after a few hours...
Yes. I thought others would also get similar answers from the same prompts.

Looks like it was only a one-session thing. Lucky that I hadn't cleared yesterday's session yet, otherwise it would've looked like I made it all up.
Attached Thumbnails: it1.png, it2.png, bg1.png, bg2.png, ae1.png