My coding skills are rather deplorable.
I must have rewritten this code 50 times, but it never works.
The recipe should be quite simple. These are the requirements:
1. RSS feed source: https://aeon.co/feed.rss
2. Exclude "videos".
3. Output EPUB
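To make requirement 2 concrete, here is a minimal standalone sketch of the filtering I have in mind, using only the Python standard library rather than calibre. The sample feed text is made up for illustration, and the idea that video items can be recognised by "/videos/" in their link is my assumption, not something I have confirmed against the real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for the real feed contents.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Aeon</title>
    <item>
      <title>An essay about memory</title>
      <link>https://aeon.co/essays/example-essay</link>
    </item>
    <item>
      <title>A short film about rivers</title>
      <link>https://aeon.co/videos/example-video</link>
    </item>
  </channel>
</rss>"""


def non_video_items(rss_text):
    """Return (title, link) pairs for every feed item that is not a video."""
    root = ET.fromstring(rss_text)
    kept = []
    for item in root.iter('item'):
        title = (item.findtext('title') or '').strip()
        link = (item.findtext('link') or '').strip()
        # Assumption: videos have '/videos/' in the link or 'video' in the title.
        if '/videos/' in link or 'video' in title.lower():
            continue
        kept.append((title, link))
    return kept


for title, link in non_video_items(SAMPLE_FEED):
    print(title, link)
```

Checking the link as well as the title seems safer to me, since a video's title does not necessarily contain the word "video".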
Recipe code (Python):
import re
from calibre.ebooks.conversion.plumber import Plumber
from calibre.web.feeds.recipes import BasicNewsRecipe

class AeonRecipe(BasicNewsRecipe):
    title = 'Aeon'
    __author__ = 'FacetiousKnave'
    description = 'This recipe fetches articles from Aeon and outputs an EPUB file'
    use_embedded_content = False
    remove_tags = [
        dict(name='iframe')
    ]

    def parse_index(self):
        items = self.index_to_soup('https://aeon.co/feed.rss').find_all('item')
        for item in items:
            title = item.title.text
            if 'video' not in title.lower():
                url = item.link.text
                date = item.pubdate.text
                self.add_article(title, url, date, text=self.fetch_article(url))

    def postprocess_html(self, soup, first_fetch):
        # Remove unwanted tags
        for tag in self.remove_tags:
            for t in soup.find_all(**tag):
                t.decompose()
        return soup
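One thing I am unsure about: as far as I can tell from the calibre API documentation, parse_index is meant to return a list of (section title, list of articles) pairs, where each article is a dictionary with keys such as 'title', 'url', 'date' and 'description'. My version returns nothing. Below is a minimal standalone sketch of that return shape, in plain Python so it can be run without calibre. The helper name build_index and the sample entries are hypothetical, purely for illustration.

```python
def build_index(entries, section='Aeon'):
    """Turn (title, url, date) tuples into the nested structure that,
    as I understand the docs, parse_index is expected to return."""
    articles = []
    for title, url, date in entries:
        # Same title-based video check as in my recipe.
        if 'video' in title.lower():
            continue
        articles.append({
            'title': title,
            'url': url,
            'date': date,
            'description': '',
        })
    # One section containing all the kept articles.
    return [(section, articles)]


# Hypothetical entries standing in for parsed feed items.
sample_entries = [
    ('An essay about memory', 'https://aeon.co/essays/example-essay', '2023-01-01'),
    ('Video: rivers from above', 'https://aeon.co/videos/example-video', '2023-01-02'),
]

index = build_index(sample_entries)
print(index)
```

If that reading is right, the recipe would need to collect the articles into a list and return it, rather than calling a method for each one.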
What am I doing wrong?