05-24-2025, 10:55 AM   #1
Villard
Connoisseur
Posts: 74
Karma: 10
Join Date: May 2016
Device: Koreader running on Kobo Libra 2
Need help with get_browser for La Croix recipe

Hello,
I'm building a recipe for the La Croix newspaper: https://www.la-croix.com/
I have a subscription to this newspaper, but my recipe cannot retrieve the articles reserved for subscribers.
I guess I need to use get_browser, but I have not managed to get it working.

The articles are under the main URL https://www.la-croix.com and the login URL has this form (in plain text):
sso.la-croix.com/auth/realms/bayard/protocol/openid-connect/auth?scope=openid&state=4a8c2a5c6410a6a1cc85a3872615ed5c&response_type=code&approval_prompt=auto&redirect_uri=https%3A%2F%2Fwww.la-croix.com%2Fconnect%2Fkeycloak%2Fcheck&client_id=la-croix.com
The state part "state=4a8c2a5c6410a6a1cc85a3872615ed5c" seems to be random and changes at every visit.
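
To make the structure of that login URL easier to see, here is a small sketch that splits it into its query parameters using only the Python standard library. The URL is the one quoted above with an https:// scheme added, and auth_params is just a helper name made up for this example. In OAuth2/OpenID Connect, state is an opaque per-request value that the site generates and checks again on the redirect back, so a value that changes at every visit is expected, and a recipe probably should not hardcode it.

```python
from urllib.parse import urlsplit, parse_qs

# Login URL as quoted in the post (https:// scheme added for parsing).
# The state value is only the one observed on a single visit.
AUTH_URL = (
    'https://sso.la-croix.com/auth/realms/bayard/protocol/openid-connect/auth'
    '?scope=openid&state=4a8c2a5c6410a6a1cc85a3872615ed5c&response_type=code'
    '&approval_prompt=auto'
    '&redirect_uri=https%3A%2F%2Fwww.la-croix.com%2Fconnect%2Fkeycloak%2Fcheck'
    '&client_id=la-croix.com'
)


def auth_params(url):
    """Return the query parameters of an authorization URL as a flat dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here.
    return {name: values[0] for name, values in query.items()}


params = auth_params(AUTH_URL)
for name, value in params.items():
    print(f'{name} = {value}')
```

The decoded redirect_uri points back to www.la-croix.com, which suggests the login is started from the main site and only passes through sso.la-croix.com.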

Thank you for any help you can give me.

Here is a sample of my current recipe:
Code:
from calibre.web.feeds.news import BasicNewsRecipe, classes
import re

class LaCroix(BasicNewsRecipe):
    title = 'La Croix'
    needs_subscription = True
    language = 'fr'
    remove_empty_feeds = True
    ignore_duplicate_articles = {'title', 'url'}
    reverse_article_order = True

    feeds = [
        ('International', 'https://www.la-croix.com/feeds/rss/international.xml'),
    ]

    keep_only_tags = [
        dict(name='div', class_='article-container article-container__columns'),
        dict(name='div', class_='article-content'),
    ]

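    remove_tags = [
        dict(name='div', class_='read-also'),
        # 'section_' is matched as a literal attribute name, so this entry likely
        # matches nothing; class_='page-section' may have been intended.
        dict(name='div', section_='page-section'),
        dict(name='div', class_='tag-list'),
        dict(name='div', class_='list list--separator'),
        dict(name='div', class_='list-box'),
    ]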

    # Indented so it is a class attribute of the recipe rather than a module-level variable.
    calibre_most_common_ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36'
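
For reference, the usual shape of a get_browser login is sketched below. It is untested against La Croix and rests on several assumptions: that opening the site's own "log in" link redirects to the SSO page with a fresh state value, that the login form is the first form on that page, and that its fields are named username and password (the default Keycloak names). login_with_form and LOGIN_URL are hypothetical names for this example, and the whole approach only works if the login page does not need JavaScript.

```python
def login_with_form(br, login_url, username, password,
                    user_field='username', pass_field='password'):
    """Submit credentials through the form that login_url leads to.

    br is expected to behave like the mechanize browser that calibre's
    BasicNewsRecipe.get_browser() returns.
    """
    # Open the site's own login link instead of a hand-built SSO URL, so the
    # server generates the state parameter and the browser follows the redirects.
    br.open(login_url)
    # Assumption: the login form is the first form on the resulting page.
    br.select_form(nr=0)
    # Assumption: field names follow the Keycloak defaults; check the page source.
    br[user_field] = username
    br[pass_field] = password
    br.submit()
    return br


# How it might be wired into the recipe class (needs calibre, so shown as a comment).
# LOGIN_URL is a placeholder for the address behind the site's "log in" link.
#
#     def get_browser(self, *args, **kwargs):
#         br = BasicNewsRecipe.get_browser(self, *args, **kwargs)
#         if self.username and self.password:
#             login_with_form(br, LOGIN_URL, self.username, self.password)
#         return br
```

With needs_subscription = True, calibre asks for the credentials and exposes them as self.username and self.password, which is what the commented get_browser relies on.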