MobileRead Forums - View Single Post - Recipe le Monde : How to keep only the URL of the printed edition ?

unkn0wn · 02-14-2023, 02:01 AM

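You can override

def preprocess_raw_html(self, raw, *a):

and run re.search on raw (it is a plain string, so it has no search method of its own) to check whether the article is from the print edition. Capture the date with a regex group, then parse that date by importing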

from calibre.utils.date import parse_date
from datetime import datetime, timedelta

and check

if (today - date) > timedelta(1):
    self.abort_article('Skipping old article')

If an article is not from the print edition, or if it is more than a day old, use self.abort_article to abort it.
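A rough sketch of that check as a standalone helper, so it can be tried outside calibre. The two regex patterns are made up for illustration (I have not checked what the Le Monde markup really looks like), skip_reason is a hypothetical name, and datetime.fromisoformat stands in for calibre's parse_date:

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical patterns: inspect the real page source and adjust both.
PRINT_MARKER = re.compile(r'data-edition="print"')
PUBLISHED = re.compile(r'property="article:published_time"\s+content="([^"]+)"')


def skip_reason(raw, now, max_age=timedelta(days=1)):
    """Return a message if the article should be aborted, else None."""
    if not PRINT_MARKER.search(raw):
        return 'Skipping article that is not in the print edition'
    match = PUBLISHED.search(raw)
    if match is None:
        # No date found: keep the article rather than guess.
        return None
    published = datetime.fromisoformat(match.group(1))
    if published.tzinfo is None:
        # Assumption: treat dates without an offset as UTC.
        published = published.replace(tzinfo=timezone.utc)
    if now - published > max_age:
        return 'Skipping old article'
    return None
```

Inside the recipe, preprocess_raw_html would call skip_reason(raw, datetime.now(timezone.utc)), pass any non-None result to self.abort_article, and otherwise return raw unchanged.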

Maybe there are other methods too; figure it out.
Look for similar code in other recipes.
