Old 07-24-2022, 10:29 AM   #12
bugmen00t
Connoisseur
New Recipes (part 03 of ??)

NEW ENGLISH RECIPES (OF RUSSIAN SOURCES)

Habr (English version): collaborative blog about IT, computer science and Internet. Favicon.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class Habr(BasicNewsRecipe):
    title                 = 'Habr'
    __author__            = 'bugmen00t'
    description           = 'Russian collaborative blog about IT, computer science and anything related to the Internet'
    publisher             = 'Habr Blockchain Publishing LTD'
    category              = 'blog'
    cover_url = u'https://hsto.org/webt/f1/lq/ka/f1lqkaveikdfqkb_rip_4vq4s_8.png'
    language              = 'en_RU'
    no_stylesheets        = True
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 30
    max_articles_per_feed = 30

    remove_tags_before = dict(name='h1')
    
    remove_tags_after = dict(name='div', attrs={'class':'tm-misprint-area'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'tm-article-presenter__meta'}),
        dict(name='div', attrs={'class': 'tm-article-poll'})
        ] 

    feeds = [
        ('News', 'https://habr.com/en/rss/news/?fl=en'),
        ('All materials', 'https://habr.com/en/rss/all?fl=en')
    ]

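    def preprocess_html(self, soup):
        # Images that carry a 'data-src' attribute (typically lazy-loaded)
        # keep their real URL there; copy it into 'src' so it gets downloaded.
        for img in soup.findAll('img', attrs={'data-src': True}):
            img['src'] = img['data-src']
        return soup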



NEW ENGLISH RECIPES (OF UKRAINIAN SOURCES)

Interfax Ukraine (English version): Interfax-Ukraine News Agency. Favicon replacement.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class InterfaxUAEN(BasicNewsRecipe):
    title                 = 'Interfax-Ukraine'
    __author__            = 'bugmen00t'
    description           = 'The Interfax-Ukraine News Agency, founded in 1992, is a subsidiary of Interfax Information Services.'
    publisher             = 'Interfax-Ukraine News Agency'
    category              = 'newspaper'
    cover_url = u'https://interfax.com.ua/static/articles/images/interfax_ukraine_logo_eng.svg'
    language              = 'en_UK'
    no_stylesheets        = True
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 3
    max_articles_per_feed = 30

    remove_tags_before = dict(name='article', attrs={'class':'article article-content-view'})
    
    remove_tags_after = dict(name='article', attrs={'class':'article article-content-view'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'grid article-content-secondary-header'}),
        dict(name='div', attrs={'class': 'article-tags'}),
        ] 

    feeds = [
        ('Latest news', 'https://en.interfax.com.ua/news/last.rss')
    ]


Ukrainska Pravda (English version): Ukrainian online newspaper with an emphasis on politics. Favicon.
Fixes needed:
  • Unable to render Cyrillic text fragments correctly
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class PravdaUAEN(BasicNewsRecipe):
    title                 = 'Ukrainska Pravda'
    __author__            = 'bugmen00t'
    description           = 'Ukrainian online newspaper founded by Georgiy Gongadze with an emphasis on the politics of Ukraine.'
    publisher             = 'pravda.com.ua'
    category              = 'newspaper'
    cover_url = u'https://img.pravda.com/images/up_for_fb.gif'
    language              = 'en_UK'
    no_stylesheets        = False
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 30
    max_articles_per_feed = 30

    remove_tags_before = dict(name='h1')
    
    remove_tags_after = dict(name='article', attrs={'class':'post'})

    remove_tags =   [
        dict(name='footer'),
        dict(name='div', attrs={'class': 'nts-video-wrapper'}),
        dict(name='div', attrs={'class': 'post-report'}),
        dict(name='div', attrs={'class': 'post__report'}),
        dict(name='div', attrs={'class': 'social_item'}),
        dict(name='div', attrs={'class': 'sidebar'}),
        dict(name='div', attrs={'class': 'article-announcement-photo article-announcement-photo-block-1'}),
        dict(name='div', attrs={'class': 'statistic-bottom-block statistic-top-block'}),
        dict(name='div', attrs={'class': 'modal modal_search modal_white'}),
        dict(name='div', attrs={'class': 'modal_auth modal_white'}),
        dict(name='div', attrs={'class': 'main_logo'}),
        dict(name='div', attrs={'class': 'footer_banner'}),
        dict(name='nav', attrs={'class': 'block block_menu'}),
        dict(name='div', attrs={'class': 'pagewrap page-point'}),
        dict(name='div', attrs={'class': 'modal fade search-popup'}),
        dict(name='div', attrs={'data-vr-zone': 'Mobile main menu'}),
        dict(name='aside'),
        dict(name='div', attrs={'class': 'block_related'}),
        dict(name='div', attrs={'class': 'block_comments'}),
        dict(name='div', attrs={'class': 'post_tags'}),
        dict(name='div', attrs={'class': 'post__tags'})
        ] 

    feeds = [
        ('All materials', 'https://www.pravda.com.ua/eng/rss/'),
        ('Top news', 'https://www.pravda.com.ua/eng/rss/view_mainnews/'),
        ('News', 'https://www.pravda.com.ua/eng/rss/view_news/'),
        ('Articles', 'https://www.pravda.com.ua/eng/rss/view_pubs/')
    ]




NEW RUSSIAN RECIPES

Хабр: collaborative blog about IT, computer science and Internet. Favicon.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class Habr(BasicNewsRecipe):
    title                 = '\u0425\u0430\u0431\u0440'
    __author__            = 'bugmen00t'
    description           = '\u041D\u0430 \u0425\u0430\u0431\u0440\u0435 \u0434\u0443\u043C\u0430\u044E\u0449\u0438\u0435 \u043B\u044E\u0434\u0438 \u0434\u0435\u043B\u044F\u0442\u0441\u044F \u0443\u043D\u0438\u043A\u0430\u043B\u044C\u043D\u044B\u043C \u043E\u043F\u044B\u0442\u043E\u043C. \u0417\u0434\u0435\u0441\u044C \u0431\u0443\u0434\u0435\u0442 \u043E\u0434\u0438\u043D\u0430\u043A\u043E\u0432\u043E \u0438\u043D\u0442\u0435\u0440\u0435\u0441\u043D\u043E \u043F\u0440\u043E\u0433\u0440\u0430\u043C\u043C\u0438\u0441\u0442\u0430\u043C \u0438 \u0436\u0443\u0440\u043D\u0430\u043B\u0438\u0441\u0442\u0430\u043C, \u0430\u0434\u043C\u0438\u043D\u0430\u043C \u0438 \u0440\u0435\u043A\u043B\u0430\u043C\u0449\u0438\u043A\u0430\u043C, \u0430\u043D\u0430\u043B\u0438\u0442\u0438\u043A\u0430\u043C \u0438 \u0434\u0438\u0437\u0430\u0439\u043D\u0435\u0440\u0430\u043C, \u043C\u0435\u043D\u0435\u0434\u0436\u0435\u0440\u0430\u043C \u0432\u044B\u0441\u0448\u0435\u0433\u043E \u0438 \u0441\u0440\u0435\u0434\u043D\u0435\u0433\u043E \u0437\u0432\u0435\u043D\u0430, \u0432\u043B\u0430\u0434\u0435\u043B\u044C\u0446\u0430\u043C \u043A\u0440\u0443\u043F\u043D\u044B\u0445 \u043A\u043E\u043C\u043F\u0430\u043D\u0438\u0439 \u0438 \u043D\u0435\u0431\u043E\u043B\u044C\u0448\u0438\u0445 \u0444\u0438\u0440\u043C, \u0430 \u0442\u0430\u043A\u0436\u0435 \u0432\u0441\u0435\u043C \u0442\u0435\u043C, \u0434\u043B\u044F \u043A\u043E\u0433\u043E IT \u2014 \u044D\u0442\u043E \u043D\u0435 \u043F\u0440\u043E\u0441\u0442\u043E \u0434\u0432\u0435 \u0431\u0443\u043A\u0432\u044B \u0430\u043B\u0444\u0430\u0432\u0438\u0442\u0430.'
    publisher             = 'Habr Blockchain Publishing LTD'
    category              = 'blog'
    cover_url = u'https://habr.com/img/habr_ru.png'
    language              = 'ru'
    no_stylesheets        = True
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 7
    max_articles_per_feed = 50

    remove_tags_before = dict(name='h1')
    
    remove_tags_after = dict(name='div', attrs={'class':'tm-misprint-area'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'tm-article-presenter__meta'}),
        dict(name='div', attrs={'class': 'tm-article-poll'})
        ] 

    feeds = [
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://habr.com/ru/rss/news/?fl=ru'),
        ('\u0412\u0441\u0435 \u043C\u0430\u0442\u0435\u0440\u0438\u0430\u043B\u044B', 'https://habr.com/ru/rss/all/all/?fl=ru'),
        ('\u0420\u0435\u0439\u0442\u0438\u043D\u0433 \u226510', 'https://habr.com/ru/rss/all/top10/?fl=ru'),
        ('\u0420\u0435\u0439\u0442\u0438\u043D\u0433 \u226525', 'https://habr.com/ru/rss/all/top25/?fl=ru'),
        ('\u0420\u0435\u0439\u0442\u0438\u043D\u0433 \u226550', 'https://habr.com/ru/rss/all/top50/?fl=ru'),
        ('\u0420\u0435\u0439\u0442\u0438\u043D\u0433 \u2265100', 'https://habr.com/ru/rss/all/top100/?fl=ru'),
    ]

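    def preprocess_html(self, soup):
        # Images that carry a 'data-src' attribute (typically lazy-loaded)
        # keep their real URL there; copy it into 'src' so it gets downloaded.
        for img in soup.findAll('img', attrs={'data-src': True}):
            img['src'] = img['data-src']
        return soup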


Нож: online magazine about society, psychology, science and culture. Favicon.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class KnifeMedia(BasicNewsRecipe):
    title                 = '\u041D\u043E\u0436'
    __author__            = 'bugmen00t'
    description           = '\u0418\u043D\u0442\u0435\u043B\u043B\u0435\u043A\u0442\u0443\u0430\u043B\u044C\u043D\u044B\u0439 \u0436\u0443\u0440\u043D\u0430\u043B \u043E \u043A\u0443\u043B\u044C\u0442\u0443\u0440\u0435 \u0438 \u043E\u0431\u0449\u0435\u0441\u0442\u0432\u0435'
    publisher             = '\u041C\u0438\u0445\u0430\u0438\u043B \u0426\u044B\u0433\u0430\u043D, \u0422\u0430\u0442\u044C\u044F\u043D\u0430 \u041A\u043E\u044D\u043D'
    category              = 'blog'
    cover_url = u'https://knife.media/feature/pdd/img/knife_logo.33a98aee.svg'
    language              = 'ru'
    no_stylesheets        = False
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 14
    max_articles_per_feed = 30

    remove_tags_before = dict(name='div', attrs={'class':'entry-header'})
    
    remove_tags_after = dict(name='div', attrs={'class':'entry-content'})

    remove_tags =   [
        dict(name='aside'),
        dict(name='div', attrs={'class': 'entry-header__share share'}),
        dict(name='div', attrs={'class': 'entry-comments'}),
        dict(name='div', attrs={'class': 'entry-footer'}),
        dict(name='div', attrs={'class': 'entry-bottom'}),
        dict(name='figure', attrs={'class': 'figure figure--similar'})
        ] 

    feeds = [
        ('\u041B\u043E\u043D\u0433\u0440\u0438\u0434\u044B', 'https://knife.media/category/longreads/feed/'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://knife.media/category/news/feed/')
    ]


Интерфакс Украина: Interfax-Ukraine News Agency. Favicon replacement.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class InterfaxUARU(BasicNewsRecipe):
    title                 = '\u0418\u043D\u0442\u0435\u0440\u0444\u0430\u043A\u0441-\u0423\u043A\u0440\u0430\u0438\u043D\u0430'
    __author__            = 'bugmen00t'
    description           = '\u0418\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u044F \u043E \u043F\u043E\u0441\u043B\u0435\u0434\u043D\u0438\u0445 \u0441\u043E\u0431\u044B\u0442\u0438\u044F\u0445 \u0432 \u043F\u043E\u043B\u0438\u0442\u0438\u043A\u0435 \u0423\u043A\u0440\u0430\u0438\u043D\u044B, \u043A\u043B\u044E\u0447\u0435\u0432\u044B\u0435 \u0443\u043A\u0440\u0430\u0438\u043D\u0441\u043A\u0438\u0435 \u044D\u043A\u043E\u043D\u043E\u043C\u0438\u0447\u0435\u0441\u043A\u0438\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438 \u0438 \u043E\u0441\u043D\u043E\u0432\u043D\u044B\u0435 \u0441\u043E\u0431\u044B\u0442\u0438\u044F \u0432 \u0441\u0442\u0440\u0430\u043D\u0430\u0445 \u0421\u041D\u0413 \u0438 \u043C\u0438\u0440\u0430.'
    publisher             = '\u0418\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u043E\u043D\u043D\u043E\u0435 \u0430\u0433\u0435\u043D\u0442\u0441\u0442\u0432\u043E \u00AB\u0418\u043D\u0442\u0435\u0440\u0444\u0430\u043A\u0441-\u0423\u043A\u0440\u0430\u0438\u043D\u0430\u00BB'
    category              = 'newspaper'
    cover_url = u'https://interfax.com.ua/static/articles/images/interfax_ukraine_logo_rus.svg'
    language              = 'ru_UK'
    no_stylesheets        = True
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 2
    max_articles_per_feed = 30

    remove_tags_before = dict(name='article', attrs={'class':'article article-content-view'})
    
    remove_tags_after = dict(name='article', attrs={'class':'article article-content-view'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'grid article-content-secondary-header'}),
        dict(name='div', attrs={'class': 'article-tags'}),
        ] 

    feeds = [
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://ru.interfax.com.ua/news/last.rss')
    ]


Украинская правда: Ukrainian online newspaper with an emphasis on politics. Favicon.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class PravdaUARU(BasicNewsRecipe):
    title                 = '\u0423\u043A\u0440\u0430\u0438\u043D\u0441\u043A\u0430\u044F \u043F\u0440\u0430\u0432\u0434\u0430'
    __author__            = 'bugmen00t'
    description           = '\u0418\u043D\u0442\u0435\u0440\u043D\u0435\u0442-\u0438\u0437\u0434\u0430\u043D\u0438\u0435, \u043E\u0441\u043D\u043E\u0432\u043D\u0430\u044F \u0442\u0435\u043C\u0430\u0442\u0438\u043A\u0430 \u2014 \u043F\u043E\u043B\u0438\u0442\u0438\u043A\u0430, \u0441\u043E\u0446\u0438\u0430\u043B\u044C\u043D\u044B\u0435 \u043F\u0440\u043E\u0431\u043B\u0435\u043C\u044B, \u044D\u043A\u043E\u043D\u043E\u043C\u0438\u043A\u0430. '
    publisher             = 'pravda.com.ua'
    category              = 'newspaper'
    cover_url = u'https://img.pravda.com/images/up_for_fb.gif'
    language              = 'ru_UK'
    no_stylesheets        = False
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 7
    max_articles_per_feed = 30

    remove_tags_before = dict(name='h1')
    
    remove_tags_after = dict(name='article', attrs={'class':'post'})

    remove_tags =   [
        dict(name='footer'),
        dict(name='div', attrs={'class': 'nts-video-wrapper'}),
        dict(name='div', attrs={'class': 'post-report'}),
        dict(name='div', attrs={'class': 'post__report'}),
        dict(name='div', attrs={'class': 'social_item'}),
        dict(name='div', attrs={'class': 'sidebar'}),
        dict(name='div', attrs={'class': 'article-announcement-photo article-announcement-photo-block-1'}),
        dict(name='div', attrs={'class': 'statistic-bottom-block statistic-top-block'}),
        dict(name='div', attrs={'class': 'modal modal_search modal_white'}),
        dict(name='div', attrs={'class': 'modal_auth modal_white'}),
        dict(name='div', attrs={'class': 'main_logo'}),
        dict(name='div', attrs={'class': 'footer_banner'}),
        dict(name='nav', attrs={'class': 'block block_menu'}),
        dict(name='div', attrs={'class': 'pagewrap page-point'}),
        dict(name='div', attrs={'class': 'modal fade search-popup'}),
        dict(name='div', attrs={'data-vr-zone': 'Mobile main menu'}),
        dict(name='aside'),
        dict(name='div', attrs={'class': 'block_related'}),
        dict(name='div', attrs={'class': 'block_comments'}),
        dict(name='div', attrs={'class': 'post_tags'}),
        dict(name='div', attrs={'class': 'post__tags'})
        ] 

    feeds = [
        ('\u0412\u0441\u0435 \u043C\u0430\u0442\u0435\u0440\u0438\u0430\u043B\u044B', 'https://www.pravda.com.ua/rus/rss/'),
        ('\u0413\u043B\u0430\u0432\u043D\u044B\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://www.pravda.com.ua/rus/rss/view_mainnews/'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://www.pravda.com.ua/rus/rss/view_news/'),
        ('\u041F\u0443\u0431\u043B\u0438\u043A\u0430\u0446\u0438\u0438', 'https://www.pravda.com.ua/rus/rss/view_pubs/'),
    ]


NEW UKRAINIAN RECIPES

Interfax Україна: Interfax-Ukraine News Agency. Favicon replacement.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class InterfaxUAUA(BasicNewsRecipe):
    title                 = '\u0406\u043D\u0442\u0435\u0440\u0444\u0430\u043A\u0441-\u0423\u043A\u0440\u0430\u0457\u043D\u0430'
    __author__            = 'bugmen00t'
    description           = '\u0406\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0456\u044F \u043F\u0440\u043E \u043E\u0441\u0442\u0430\u043D\u043D\u0456 \u043F\u043E\u0434\u0456\u0457 \u0432 \u043F\u043E\u043B\u0456\u0442\u0438\u0446\u0456 \u0423\u043A\u0440\u0430\u0457\u043D\u0438, \u043A\u043B\u044E\u0447\u043E\u0432\u0456 \u0443\u043A\u0440\u0430\u0457\u043D\u0441\u044C\u043A\u0456 \u0435\u043A\u043E\u043D\u043E\u043C\u0456\u0447\u043D\u0456 \u043D\u043E\u0432\u0438\u043D\u0438 \u0442\u0430 \u043E\u0441\u043D\u043E\u0432\u043D\u0456 \u043F\u043E\u0434\u0456\u0457 \u0432 \u043A\u0440\u0430\u0457\u043D\u0430\u0445 \u0421\u041D\u0414 \u0456 \u0441\u0432\u0456\u0442\u0443.'
    publisher             = '\u0406\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0456\u0439\u043D\u0435 \u0430\u0433\u0435\u043D\u0442\u0441\u0442\u0432\u043E \u00AB\u0406\u043D\u0442\u0435\u0440\u0444\u0430\u043A\u0441-\u0423\u043A\u0440\u0430\u0457\u043D\u0430\u00BB'
    category              = 'newspaper'
    cover_url = u'https://interfax.com.ua/static/articles/images/interfax_ukraine_logo_ukr.svg'
    language              = 'uk'
    no_stylesheets        = True
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 2
    max_articles_per_feed = 30

    remove_tags_before = dict(name='article', attrs={'class':'article article-content-view'})
    
    remove_tags_after = dict(name='article', attrs={'class':'article article-content-view'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'grid article-content-secondary-header'}),
        dict(name='div', attrs={'class': 'article-tags'}),
        ] 

    feeds = [
        ('\u041D\u043E\u0432\u0438\u043D\u0438', 'https://interfax.com.ua/news/last.rss')
    ]


Українська правда: Ukrainian online newspaper with an emphasis on politics. Favicon.
Fixes needed:
  • Text/codepage encoding error in some articles
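
A possible starting point for tracking this down, in case it helps anyone. The sketch below is not part of the recipe and only illustrates the general symptom: it assumes (unconfirmed) that some article pages are served in a legacy Cyrillic codepage such as cp1251 while others are UTF-8. The helper name decode_with_fallback and the choice of cp1251 are my own assumptions. If the cause does turn out to be a wrong charset declaration, BasicNewsRecipe has an `encoding` attribute that overrides it, but that applies to the whole recipe rather than per article.

```python
# Standalone illustration (no calibre needed): the same Cyrillic text encoded
# two ways, and a helper that picks the first encoding that decodes cleanly.
# The cp1251 guess is an ASSUMPTION, not something confirmed for the site.

def decode_with_fallback(raw, encodings=("utf-8", "cp1251")):
    """Return (text, encoding_used) for the first encoding that fits."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: keep the text, marking the bad bytes.
    return raw.decode(encodings[0], errors="replace"), encodings[0]


sample = "\u0423\u043a\u0440\u0430\u0457\u043d\u0441\u044c\u043a\u0430 \u043f\u0440\u0430\u0432\u0434\u0430"
utf8_bytes = sample.encode("utf-8")
cp1251_bytes = sample.encode("cp1251")

# Decoding UTF-8 bytes with the wrong codepage raises no error here;
# it silently yields garbled text, which is the typical symptom.
garbled = utf8_bytes.decode("cp1251")

text_a, enc_a = decode_with_fallback(utf8_bytes)
text_b, enc_b = decode_with_fallback(cp1251_bytes)
print(enc_a, enc_b)
```

UTF-8 is tried first because cp1251 bytes for Cyrillic text are almost never valid UTF-8, whereas the reverse order would accept most UTF-8 input and garble it.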
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class PravdaUAUA(BasicNewsRecipe):
    title                 = '\u0423\u043A\u0440\u0430\u0457\u043D\u0441\u044C\u043A\u0430 \u043F\u0440\u0430\u0432\u0434\u0430'
    __author__            = 'bugmen00t'
    description           = '\u0423\u043A\u0440\u0430\u0457\u043D\u0441\u044C\u043A\u0435 \u0441\u0443\u0441\u043F\u0456\u043B\u044C\u043D\u043E-\u043F\u043E\u043B\u0456\u0442\u0438\u0447\u043D\u0435 \u0456\u043D\u0442\u0435\u0440\u043D\u0435\u0442-\u0417\u041C\u0406'
    publisher             = 'pravda.com.ua'
    category              = 'newspaper'
    cover_url = u'https://img.pravda.com/images/up_for_fb.gif'
    language              = 'uk'
    no_stylesheets        = False
    remove_javascript = False
    auto_cleanup   = False
    oldest_article = 7
    max_articles_per_feed = 30

    remove_tags_before = dict(name='h1')
    
    remove_tags_after = dict(name='article', attrs={'class':'post'})

    remove_tags =   [
        dict(name='footer'),
        dict(name='div', attrs={'class': 'nts-video-wrapper'}),
        dict(name='div', attrs={'class': 'post-report'}),
        dict(name='div', attrs={'class': 'post__report'}),
        dict(name='div', attrs={'class': 'social_item'}),
        dict(name='div', attrs={'class': 'sidebar'}),
        dict(name='div', attrs={'class': 'article-announcement-photo article-announcement-photo-block-1'}),
        dict(name='div', attrs={'class': 'statistic-bottom-block statistic-top-block'}),
        dict(name='div', attrs={'class': 'modal modal_search modal_white'}),
        dict(name='div', attrs={'class': 'modal_auth modal_white'}),
        dict(name='div', attrs={'class': 'main_logo'}),
        dict(name='div', attrs={'class': 'footer_banner'}),
        dict(name='nav', attrs={'class': 'block block_menu'}),
        dict(name='div', attrs={'class': 'pagewrap page-point'}),
        dict(name='div', attrs={'class': 'modal fade search-popup'}),
        dict(name='div', attrs={'data-vr-zone': 'Mobile main menu'}),
        dict(name='aside'),
        dict(name='div', attrs={'class': 'block_related'}),
        dict(name='div', attrs={'class': 'block_comments'}),
        dict(name='div', attrs={'class': 'post_tags'}),
        dict(name='div', attrs={'class': 'post__tags'})
        ] 

    feeds = [
        ('\u0412\u0441\u0456 \u043C\u0430\u0442\u0435\u0440\u0456\u0430\u043B\u0438', 'https://www.pravda.com.ua/rss/'),
        ('\u041D\u0430\u0439\u0432\u0430\u0436\u043B\u0438\u0432\u0456\u0448\u0456 \u043D\u043E\u0432\u0438\u043D\u0438', 'https://www.pravda.com.ua/rss/view_mainnews/'),
        ('\u041D\u043E\u0432\u0438\u043D\u0438', 'https://www.pravda.com.ua/rss/view_news/'),
        ('\u041F\u0443\u0431\u043B\u0456\u043A\u0430\u0446\u0456\u0457', 'https://www.pravda.com.ua/rss/view_pubs/'),
    ]