07-19-2022, 04:41 PM   #5
bugmen00t
Built-in Russian recipes

FIXED RUSSIAN RECIPES


Improved built-in 3DNews: Daily Digital Digest recipe (3dnews.recipe): HTTPS, revised RSS feeds.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class News(BasicNewsRecipe):
    title = '3DNews: Daily Digital Digest'
    __author__ = 'bugmen00t'
    description = '\u041D\u0435\u0437\u0430\u0432\u0438\u0441\u0438\u043C\u043E\u0435 \u0440\u043E\u0441\u0441\u0438\u0439\u0441\u043A\u043E\u0435 \u043E\u043D\u043B\u0430\u0439\u043D-\u0438\u0437\u0434\u0430\u043D\u0438\u0435, \u043F\u043E\u0441\u0432\u044F\u0449\u0435\u043D\u043D\u043E\u0435 \u0446\u0438\u0444\u0440\u043E\u0432\u044B\u043C \u0442\u0435\u0445\u043D\u043E\u043B\u043E\u0433\u0438\u044F\u043C'
    publisher = '3DNews'
    category = 'news'
    cover_url = u'http://www.3dnews.ru/assets/images/logo.png'
    language = 'ru'
    auto_cleanup = True

    oldest_article = 15
    max_articles_per_feed = 60

    feeds = [
        ('\u0412\u0430\u0436\u043D\u044B\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438','https://3dnews.ru/breaking/rss/'),
        ('\u0412\u0441\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438','https://3dnews.ru/news/rss/'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438 - \u0445\u0430\u0440\u0434','https://3dnews.ru/hardware-news/rss'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438 - \u0433\u0430\u0434\u0436\u0435\u0442\u044B','https://3dnews.ru/gadgets/rss/'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438 - \u0441\u043E\u0444\u0442','https://3dnews.ru/software-news/rss/'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438 - \u0438\u0433\u0440\u044B','https://3dnews.ru/games/rss/'),
        ('\u0423\u043C\u043D\u044B\u0435 \u0412\u0435\u0449\u0438','https://3dnews.ru/smart-things/rss/'),
        ('\u0410\u043D\u0430\u043B\u0438\u0442\u0438\u043A\u0430','https://3dnews.ru/editorial/rss/'),
        ('\u041F\u0440\u043E\u0446\u0435\u0441\u0441\u043E\u0440\u044B \u0438 \u043F\u0430\u043C\u044F\u0442\u044C','https://3dnews.ru/cpu/rss/'),
        ('\u041C\u0430\u0442\u0435\u0440\u0438\u043D\u0441\u043A\u0438\u0435 \u043F\u043B\u0430\u0442\u044B','https://3dnews.ru/motherboard/rss/'),
        ('\u041A\u043E\u0440\u043F\u0443\u0441\u0430, \u0411\u041F \u0438 \u043E\u0445\u043B\u0430\u0436\u0434\u0435\u043D\u0438\u0435','https://3dnews.ru/cooling/rss/'),
        ('\u0412\u0438\u0434\u0435\u043E\u043A\u0430\u0440\u0442\u044B','https://3dnews.ru/video/rss/'),
        ('\u041C\u043E\u043D\u0438\u0442\u043E\u0440\u044B \u0438 \u043F\u0440\u043E\u0435\u043A\u0442\u043E\u0440\u044B','https://3dnews.ru/display/rss/'),
        ('\u041D\u0430\u043A\u043E\u043F\u0438\u0442\u0435\u043B\u0438','https://3dnews.ru/storage/rss/'),
        ('\u0426\u0438\u0444\u0440\u043E\u0432\u043E\u0439 \u0430\u0432\u0442\u043E\u043C\u043E\u0431\u0438\u043B\u044C','https://3dnews.ru/auto/rss/'),
        ('\u0421\u043E\u0442\u043E\u0432\u0430\u044F \u0441\u0432\u044F\u0437\u044C','https://3dnews.ru/phone/rss/'),
        ('\u041F\u0435\u0440\u0438\u0444\u0435\u0440\u0438\u044F','https://3dnews.ru/peripheral/rss/'),
        ('\u041D\u043E\u0443\u0442\u0431\u0443\u043A\u0438 \u0438 \u041F\u041A','https://3dnews.ru/mobile/rss/'),
        ('\u041F\u043B\u0430\u043D\u0448\u0435\u0442\u044B','https://3dnews.ru/tablets/rss/'),
        ('\u0417\u0432\u0443\u043A \u0438 \u0430\u043A\u0443\u0441\u0442\u0438\u043A\u0430','https://3dnews.ru/multimedia/rss/'),
        ('\u0426\u0438\u0444\u0440\u043E\u0432\u043E\u0435 \u0444\u043E\u0442\u043E \u0438 \u0432\u0438\u0434\u0435\u043E','https://3dnews.ru/digital/rss/'),
        ('\u0421\u0435\u0442\u0438 \u0438 \u043A\u043E\u043C\u043C\u0443\u043D\u0438\u043A\u0430\u0446\u0438\u0438','https://3dnews.ru/communication/rss/'),
        ('\u041F\u0440\u043E\u0433\u0440\u0430\u043C\u043C\u043D\u043E\u0435 \u043E\u0431\u0435\u0441\u043F\u0435\u0447\u0435\u043D\u0438\u0435','https://3dnews.ru/software/rss/'),
        ('Off-\u0441\u044F\u043D\u043A\u0430','https://3dnews.ru/offsyanka/rss/'),
        ('\u041C\u0430\u0441\u0442\u0435\u0440\u0441\u043A\u0430\u044F','https://3dnews.ru/workshop/rss/'),
        ('ServerNews - \u0441\u0442\u0430\u0442\u044C\u0438','https://servernews.ru/rss'),
        ('ServerNews - \u043D\u043E\u0432\u043E\u0441\u0442\u0438','https://servernews.ru/news/rss')
    ]

    def print_version(self, url):
        return url + '/print'


Improved built-in 7x7 recipe (7x7.recipe): new domain, revised RSS feeds. Unable to fetch lazy-loaded images, so output is text-only for now. Bonus: favicon
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class News(BasicNewsRecipe):
    title = '7x7'
    __author__ = 'bugmen00t'
    description = '7x7 - \u043C\u0435\u0436\u0440\u0435\u0433\u0438\u043E\u043D\u0430\u043B\u044C\u043D\u044B\u0439 \u0438\u043D\u0442\u0435\u0440\u043D\u0435\u0442-\u0436\u0443\u0440\u043D\u0430\u043B'
    publisher = '7x7-journal.ru'
    category = 'news'
    cover_url = u'https://semnasem.org/site-specific/7x7-journal.ru/images/frontend/logo/logo-header.svg'
    language = 'ru'
    no_stylesheets = True
    remove_javascript = True
    auto_cleanup = False

    oldest_article = 14
    max_articles_per_feed = 30

    feeds = [
        ('7x7', 'https://semnasem.org/rss/default.xml'),
    ]

    remove_tags_before = dict(name='article',attrs={'class': 'article'})
    
    remove_tags_after = dict(name='div', attrs={'class': 'article__footer-wrap'})
    
    remove_tags = [
        dict(name='div', attrs={'class': 'article__footer-wrap'}),
        dict(name='div', attrs={'class': 'promolink-widget'})
    ]
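
On the lazy-loaded images: a minimal sketch of one common workaround, in case it helps. It assumes the site keeps the real image URL in a `data-src` attribute, which I have not verified for semnasem.org, so check the page source for the actual attribute name. The helper name `promote_lazy_src` is made up for illustration. In a recipe it would be called from `preprocess_html(self, soup)`.

```python
def promote_lazy_src(soup, lazy_attr='data-src'):
    """Copy the lazy-load URL into src for every <img> that carries one.

    `soup` is the parsed page (the BeautifulSoup tree a recipe receives
    in preprocess_html). `lazy_attr` is an ASSUMPTION: the real attribute
    name has to be looked up in the site's HTML.
    """
    for img in soup.findAll('img', attrs={lazy_attr: True}):
        img['src'] = img[lazy_attr]
    return soup


# Hypothetical use inside a recipe class (indented as a method):
#
#     def preprocess_html(self, soup):
#         return promote_lazy_src(soup, lazy_attr='data-src')
```

If the images still do not show up, the URL may be injected by JavaScript rather than stored in an attribute, in which case this approach will not be enough.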


Fixed built-in Izvestia recipe (izvestia.recipe): HTTPS, revised RSS feeds.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

__license__ = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
izvestia.ru
'''

from calibre.web.feeds.news import BasicNewsRecipe

class Izvestia(BasicNewsRecipe):
    title = 'Izvestia'
    __author__ = 'Darko Miletic (with fixes by bugmen00t)'
    description = 'News from Russia'
    publisher = 'Izvestia'
    category = 'news, politics, Russia'
    oldest_article = 5
    max_articles_per_feed = 100
    auto_cleanup = False
    no_stylesheets = True
    use_embedded_content = False
    language = 'ru'
    publication_type = 'newspaper'
    cover_url = u'https://cdn.iz.ru/profiles/portal/themes/purple/images/favicons/apple-icon-180x180.png'

    remove_tags_before = dict(name='div', attrs={'role': 'article'})
  
    remove_tags_after = dict(name='div', attrs={'role': 'article'})
    
    remove_tags = [
        dict(name='div', attrs={'class': 'article_page__left__top__views'}),
        dict(name='div', attrs={'class': 'hash_tags'}),
        dict(name='div', attrs={'class': 'get_yandex_subscription_links'}),
        dict(name='div', attrs={'class': 'article_buttons_block'}),
        dict(name='div', attrs={'class': 'rubrics_btn'}),
        dict(name='div', attrs={'class': 'hidden'}),
        dict(name='div', attrs={'class': 'share_bottom2'}),                        
        dict(name='div', attrs={'class': 'recommendation-block'}),                        
        dict(name='div', attrs={'class': 'plug-text'}),                                     
        dict(name='div', attrs={'class': 'get_news_link'}),                        
        dict(name='div', attrs={'itemprop': 'address'})
        ]

    feeds = [
        (u'Новости', u'https://iz.ru/xml/rss/all.xml')]

    # Lazy-loaded images keep the real URL in data-src; copy it into src.
    def preprocess_html(self, soup):
        for img in soup.findAll('img', attrs={'data-src': True}):
            img['src'] = img['data-src']
        return soup


Fixed built-in Kommersant recipe (kommersant.recipe): HTTPS, revised RSS feeds. I couldn't figure out how to keep the images, though; I'd be grateful if anyone could help.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

__license__ = 'GPL v3'
__copyright__ = '2010-2013, Darko Miletic <darko.miletic at gmail.com>'
'''
www.kommersant.ru
'''

from calibre.web.feeds.news import BasicNewsRecipe


class Kommersant_ru(BasicNewsRecipe):
    title = 'Kommersant'
    __author__ = 'Darko Miletic (with fixes by bugmen00t)'
    description = 'News from Russia'
    publisher = 'Kommersant'
    category = 'news, politics, Russia'
    oldest_article = 7
    max_articles_per_feed = 50
    no_stylesheets = True
    use_embedded_content = False
    language = 'ru'
    publication_type = 'newspaper'
    cover_url = 'https://iv.kommersant.ru/ContentFlex/images/logo.png'

    remove_tags_before = dict(name='header', attrs={'class': 'doc_header'})

    remove_tags_after = dict(name='div', attrs={'class': 'doc__text document_authors'})

    remove_tags = [
        dict(name='ul', attrs={'class': 'crumbs'}),
        dict(name='div', attrs={'class': 'hide_desktop'}),
        dict(name='div', attrs={'class': 'incut incut--right'}),
        dict(name='div', attrs={'class': 'incut incut--left'}),
        dict(name='div', attrs={'class': 'incut incut--center'}),
        dict(name='div', attrs={'class': 'ba'}),
        dict(name='div', attrs={'id': 'lenta'}),        
        dict(name='div', attrs={'class': 'layout basement_news__body'}),
        dict(name='footer', attrs={'class': 'footer'}),
        dict(name='div', attrs={'class': 'ui-modal'}),
        dict(name='section', attrs={'class': 'potd'}),
        dict(name='footer', attrs={'class': 'doc_footer'}),
        dict(name='div', attrs={'class': 'adv_interscroll hide_desktop'})
        ]

    feeds = [
        ('\u0413\u043B\u0430\u0432\u043D\u043E\u0435','https://www.kommersant.ru/rss/main.xml'),
        ('\u0413\u0430\u0437\u0435\u0442\u0430 "\u041A\u043E\u043C\u043C\u0435\u0440\u0441\u0430\u043D\u0442"','https://www.kommersant.ru/rss/daily.xml'),
        ('\u041B\u0435\u043D\u0442\u0430 \u043D\u043E\u0432\u043E\u0441\u0442\u0435\u0439','https://www.kommersant.ru/RSS/news.xml'),
        ('\u041C\u0430\u0442\u0435\u0440\u0438\u0430\u043B\u044B \u0441 \u0441\u0430\u0439\u0442\u0430','https://www.kommersant.ru/RSS/corp.xml'),
        ('\u0420\u0430\u0434\u0438\u043E \u041A\u043E\u043C\u043C\u0435\u0440\u0441\u0430\u043D\u0442\u044A-FM','https://www.kommersant.ru/RSS/radio.xml'),
        ('\u0422\u0435\u043C\u0430\u0442\u0438\u0447\u0435\u0441\u043A\u0438\u0435 \u043F\u0440\u0438\u043B\u043E\u0436\u0435\u043D\u0438\u044F','https://www.kommersant.ru/RSS/tema.xml'),
        ('\u0416\u0443\u0440\u043D\u0430\u043B \u00AB\u041E\u0413\u041E\u041D\u0401\u041A\u00BB','https://www.kommersant.ru/RSS/ogoniok.xml'),
        ('\u0416\u0443\u0440\u043D\u0430\u043B \u00AB\u041A\u043E\u043C\u043C\u0435\u0440\u0441\u0430\u043D\u0442\u044A WEEKEND\u00BB','https://www.kommersant.ru/RSS/weekend.xml'),
        ('\u0416\u0443\u0440\u043D\u0430\u043B \u00AB\u041A\u043E\u043C\u043C\u0435\u0440\u0441\u0430\u043D\u0442\u044A \u0410\u0412\u0422\u041E\u041F\u0418\u041B\u041E\u0422\u00BB','https://www.kommersant.ru/RSS/auto.xml'),
        ('\u041F\u043E\u043B\u0438\u0442\u0438\u043A\u0430','https://www.kommersant.ru/rss/section-politics.xml'),
        ('\u042D\u043A\u043E\u043D\u043E\u043C\u0438\u043A\u0430','https://www.kommersant.ru/RSS/section-economics.xml'),
        ('\u0411\u0438\u0437\u043D\u0435\u0441','https://www.kommersant.ru/rss/section-business.xml'),
        ('\u0412 \u043C\u0438\u0440\u0435','https://www.kommersant.ru/rss/section-world.xml'),
        ('\u041F\u0440\u043E\u0438\u0441\u0448\u0435\u0441\u0442\u0432\u0438\u044F','https://www.kommersant.ru/rss/section-accidents.xml'),
        ('\u041E\u0431\u0449\u0435\u0441\u0442\u0432\u043E','https://www.kommersant.ru/rss/section-society.xml'),
        ('\u041A\u0443\u043B\u044C\u0442\u0443\u0440\u0430','https://www.kommersant.ru/rss/section-culture.xml'),
        ('\u0421\u043F\u043E\u0440\u0442','https://www.kommersant.ru/rss/section-sport.xml'),
        ('Hi-Tech','https://www.kommersant.ru/RSS/section-hitech.xml'),
        ('\u0410\u0432\u0442\u043E','https://www.kommersant.ru/RSS/Autopilot_on.xml'),
        ('\u0421\u0442\u0438\u043B\u044C','https://www.kommersant.ru/RSS/section-style.xml'),
        ('\u0421\u0430\u043D\u043A\u0442-\u041F\u0435\u0442\u0435\u0440\u0431\u0443\u0440\u0433','https://www.kommersant.ru/rss/regions/piter_all.xml'),
        ('\u0412\u043E\u0440\u043E\u043D\u0435\u0436','https://www.kommersant.ru/rss/regions/vrn_all.xml'),
        ('\u0415\u043A\u0430\u0442\u0435\u0440\u0438\u043D\u0431\u0443\u0440\u0433','https://www.kommersant.ru/rss/regions/ekaterinburg_all.xml'),
        ('\u0418\u0436\u0435\u0432\u0441\u043A','https://www.kommersant.ru/rss/regions/izhevsk_all.xml'),
        ('\u041A\u0430\u0437\u0430\u043D\u044C','https://www.kommersant.ru/rss/regions/kazan_all.xml'),
        ('\u041A\u0440\u0430\u0441\u043D\u043E\u0434\u0430\u0440','https://www.kommersant.ru/rss/regions/krasnodar_all.xml'),
        ('\u041A\u0440\u0430\u0441\u043D\u043E\u044F\u0440\u0441\u043A','https://www.kommersant.ru/rss/regions/krasnoyarsk_all.xml'),
        ('\u041D\u0438\u0436\u043D\u0438\u0439 \u041D\u043E\u0432\u0433\u043E\u0440\u043E\u0434','https://www.kommersant.ru/rss/regions/nnov_all.xml'),
        ('\u041D\u043E\u0432\u043E\u0441\u0438\u0431\u0438\u0440\u0441\u043A','https://www.kommersant.ru/rss/regions/novosibirsk_all.xml'),
        ('\u041F\u0435\u0440\u043C\u044C','https://www.kommersant.ru/rss/regions/perm_all.xml'),
        ('\u0420\u043E\u0441\u0442\u043E\u0432-\u043D\u0430-\u0414\u043E\u043D\u0443','https://www.kommersant.ru/rss/regions/rostov_all.xml'),
        ('\u0421\u0430\u043C\u0430\u0440\u0430','https://www.kommersant.ru/rss/regions/samara_all.xml'),
        ('\u0421\u0430\u0440\u0430\u0442\u043E\u0432','https://www.kommersant.ru/rss/regions/saratov_all.xml'),
        ('\u0423\u0444\u0430','https://www.kommersant.ru/rss/regions/ufa_all.xml'),
        ('\u0427\u0435\u043B\u044F\u0431\u0438\u043D\u0441\u043A','https://www.kommersant.ru/rss/regions/chelyabinsk_all.xml')
    ]
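
For anyone who wants to dig into the missing images: a small standalone sketch, using only the Python standard library, that lists the attributes of every `<img>` tag in a saved copy of an article page. The idea is to see whether the real image URL sits in some attribute other than `src`. The names `ImgAttrLister` and `list_img_attrs` and the sample markup are made up for illustration; nothing here is taken from kommersant.ru.

```python
from html.parser import HTMLParser


class ImgAttrLister(HTMLParser):
    """Collect the attributes of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.images = []  # one dict of attributes per <img>

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            self.images.append(dict(attrs))


def list_img_attrs(html):
    """Return a list of attribute dicts, one per <img> found in `html`."""
    parser = ImgAttrLister()
    parser.feed(html)
    parser.close()
    return parser.images


if __name__ == '__main__':
    # Made-up markup for illustration only.
    sample = '<div><img class="lazy" src="stub.gif" data-lazy="photo.jpg"></div>'
    for attrs in list_img_attrs(sample):
        print(attrs)
```

If an attribute with the real URL turns up, it could be copied into `src` in `preprocess_html`. It is also worth checking whether one of the `remove_tags` rules strips the container that holds the images.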


Fixed built-in RBC.ru recipe (rbc_ru.recipe): HTTPS, page cleanup, revised RSS feeds.
Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class RBC_ru(BasicNewsRecipe):
    title = u'RBC.ru'
    __author__ = 'A. Chewi (with fixes by bugmen00t)'
    description = u'\u0420\u043E\u0441\u0441\u0438\u0439\u0441\u043A\u043E\u0435 \u0438\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u043E\u043D\u043D\u043E\u0435 \u0430\u0433\u0435\u043D\u0442\u0441\u0442\u0432\u043E \u00AB\u0420\u043E\u0441\u0411\u0438\u0437\u043D\u0435\u0441\u041A\u043E\u043D\u0441\u0430\u043B\u0442\u0438\u043D\u0433\u00BB (\u0420\u0411\u041A) - \u043B\u0435\u043D\u0442\u044B \u043D\u043E\u0432\u043E\u0441\u0442\u0435\u0439 \u043F\u043E\u043B\u0438\u0442\u0438\u043A\u0438, \u044D\u043A\u043E\u043D\u043E\u043C\u0438\u043A\u0438 \u0438 \u0444\u0438\u043D\u0430\u043D\u0441\u043E\u0432, \u0430\u043D\u0430\u043B\u0438\u0442\u0438\u0447\u0435\u0441\u043A\u0438\u0435 \u043C\u0430\u0442\u0435\u0440\u0438\u0430\u043B\u044B, \u043A\u043E\u043C\u043C\u0435\u043D\u0442\u0430\u0440\u0438\u0438 \u0438 \u043F\u0440\u043E\u0433\u043D\u043E\u0437\u044B, \u0442\u0435\u043C\u0430\u0442\u0438\u0447\u0435\u0441\u043A\u0438\u0435 \u0441\u0442\u0430\u0442\u044C\u0438'  # noqa
    needs_subscription = False
    cover_url = 'https://pics.rbc.ru/img/fp_v4/skin/img/logo.gif'
    cover_margins = (80, 160, '#ffffff')
    oldest_article = 20
    max_articles_per_feed = 50
    summary_length = 200
    remove_empty_feeds = True
    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    conversion_options = {'linearize_tables': True}
    auto_cleanup = True
    language = 'ru'
    timefmt = ' [%a, %d %b, %Y]'

    feeds = [(u'RSS \u043D\u043E\u0432\u043E\u0441\u0442\u0438', u'https://rssexport.rbc.ru/rbcnews/news/30/full.rss'),
             (u'\u0413\u043B\u0430\u0432\u043D\u044B\u0435\u0020\u043D\u043E\u0432\u043E\u0441\u0442\u0438', u'http://static.feed.rbc.ru/rbc/internal/rss.rbc.ru/rbc.ru/news.rss'),
             ]


Fixed built-in RIA Novosti - Russian recipe (ria_ru.recipe): HTTPS, revised RSS feeds.
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

__license__ = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
www.ria.ru
'''

from calibre.web.feeds.news import BasicNewsRecipe


class RIANovosti(BasicNewsRecipe):
    title = 'RIA Novosti - Russian'
    __author__ = 'Darko Miletic (with fixes by bugmen00t)'
    description = 'News from Russia'
    publisher = '\u041C\u0418\u0410 \u00AB\u0420\u043E\u0441\u0441\u0438\u044F \u0441\u0435\u0433\u043E\u0434\u043D\u044F\u00BB\u2028 (MIA Russia Today)'
    category = 'news, politics, Russia'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'utf8'
    language = 'ru'
    publication_type = 'newsportal'
    cover_url = 'https://oldimg.ria.ru/i/ria_social.png'

    remove_tags_before = dict(name='div', attrs={'class': 'article__header'})

    remove_tags_after = dict(name='div', attrs={'class': 'article__userbar'})

    remove_tags = [
        dict(name='div', attrs={'class': 'article__userbar'}),
        dict(name='div', attrs={'class': 'article__title'}),
        dict(name='div', attrs={'class': 'article__aggr'}),
        dict(name='div', attrs={'class': 'article__article-info'})
        ]

    feeds = [
    (u'\u041B\u0435\u043D\u0442\u0430 \u043D\u043E\u0432\u043E\u0441\u0442\u0435\u0439', u'https://ria.ru/export/rss2/archive/index.xml')
    ]


Improved built-in TJournal recipe (tjournal.recipe): articles cleanup, revised RSS feeds. Still shows ugly placeholders instead of the in-article images, which is a shame. Bonus: updated favicon
Spoiler:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8

from calibre.web.feeds.news import BasicNewsRecipe


class TJournal(BasicNewsRecipe):
    title = u'TJournal'
    __author__ = 'bug_me_not (with fixes by bugmen00t)'
    description = 'TJournal: \u0438\u0437\u0434\u0430\u043D\u0438\u0435 \u043E \u043C\u0435\u0434\u0438\u0430, \u0442\u0435\u0445\u043D\u043E\u043B\u043E\u0433\u0438\u044F\u0445 \u0438 \u0442\u0440\u0435\u043D\u0434\u0430\u0445'
    publisher = 'tjournal.ru'
    category = 'news'
    language = 'ru'
    no_stylesheets = False
    remove_javascript = True
    oldest_article = 30
    max_articles_per_feed = 100
    cover_url = 'https://tjournal.ru/static/build/tjournal.ru/images/search_logo.png'

    remove_tags_before = dict(
        name='div', attrs={'class': 'content-title'})

    remove_tags_after = dict(
        name='div', attrs={'class': 'content-footer content-footer--full l-island-a'})

    remove_tags = [
        dict(name='div', attrs={'class': 'content-footer content-footer--full l-island-a'}),
        dict(name='div', attrs={'air-module': 'module.distributionFloating'}),
        dict(name='span', attrs={'class': 'content-editorial-tick'}),
        dict(name='vue'),
        dict(name='div', attrs={'class': 'comments'}),
        dict(name='div', attrs={'class': 'propaganda'}),
        dict(name='div', attrs={'class': 'propaganda propaganda--with-footer'}),
        dict(name='div', attrs={'air-module': 'module.gallery'}),
        dict(name='div', attrs={'class': 'content-container'}),
        dict(name='div', attrs={'class': 'content-header__item content-header-number'}),
        dict(name='span', attrs={'class': 'views__value'}),
        dict(name='span', attrs={'class': 'views__label'})
    ]

    feeds = [
        ('\u041F\u043E\u043F\u0443\u043B\u044F\u0440\u043D\u043E\u0435','https://tjournal.ru/rss'),
        ('\u041D\u043E\u0432\u043E\u0441\u0442\u0438','https://tjournal.ru/rss/news'),
        ('\u0421\u0432\u0435\u0436\u0435\u0435','https://tjournal.ru/rss/new'),
        ('\u0422\u0435\u0445\u043D\u043E\u043B\u043E\u0433\u0438\u0438','https://tjournal.ru/rss/tech'),
        ('\u0420\u0430\u0437\u0431\u043E\u0440\u044B','https://tjournal.ru/rss/analysis'),
        ('\u0418\u043D\u0442\u0435\u0440\u043D\u0435\u0442','https://tjournal.ru/rss/internet')
        ]

    # Lazy-loaded images keep the real URL in data-image-src; copy it into src.
    def preprocess_html(self, soup):
        for img in soup.findAll('img', attrs={'data-image-src': True}):
            img['src'] = img['data-image-src']
        return soup


Fixed built-in The Insider recipe (the_insider.recipe): HTTPS, revised RSS feeds. Bonus: favicon
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8
from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class TheInsider(BasicNewsRecipe):
    title                 = u'The Insider'
    cover_url             = u'https://s3-us-west-2.amazonaws.com/anchor-generated-image-bank/production/podcast_uploaded_nologo400/10331708/10331708-1604408816914-d03520fb339d5.jpg'
    __author__            = 'bugmen00t'
    description           = '\u0420\u0430\u0441\u0441\u043B\u0435\u0434\u043E\u0432\u0430\u043D\u0438\u044F \u0420\u0435\u043F\u043E\u0440\u0442\u0430\u0436\u0438 \u0410\u043D\u0430\u043B\u0438\u0442\u0438\u043A\u0430'
    publisher             = 'theins.ru'
    category              = 'news'
    language              = 'ru'
    no_stylesheets        = True
    remove_javascript     = True
    oldest_article        = 300
    max_articles_per_feed = 100

    remove_tags_before    = dict(name='div', attrs={'id':'wrapper'})
    remove_tags_after     = dict(name='p', attrs={'style':' color: #999999;'})
    remove_tags           = [
                            dict(name='div',attrs={'class':'post-share'}),
                            dict(name='div', attrs={'class':'post-share fixed-likes'}),
                            dict(name='div', attrs={'class':'topads'}),
                            dict(name='div', attrs={'class':'pre-content-line'}),
                            dict(name='div', attrs={'class':'author-opinions'}),
                            dict(name='div', attrs={'class':'content-banner'}),
                            dict(name='div', attrs={'id':'sidebar'})
                            ]
      

    feeds                 = [
                            (u'\u041D\u043E\u0432\u043E\u0441\u0442\u0438', u'https://theins.ru/feed')
                            ]


Improved built-in iXBT.com recipe (ixbt.recipe): revised RSS feeds.
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class Ixbt(BasicNewsRecipe):
    title = 'iXBT.com'
    __author__ = 'bugmen00t'
    description = '\u0421\u043F\u0435\u0446\u0438\u0430\u043B\u0438\u0437\u0438\u0440\u043E\u0432\u0430\u043D\u043D\u044B\u0439 \u0440\u043E\u0441\u0441\u0438\u0439\u0441\u043A\u0438\u0439 \u0438\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u043E\u043D\u043D\u043E-\u0430\u043D\u0430\u043B\u0438\u0442\u0438\u0447\u0435\u0441\u043A\u0438\u0439 \u0441\u0435\u0440\u0432\u0435\u0440, \u043E\u0441\u0432\u0435\u0449\u0430\u044E\u0449\u0438\u0439 \u0432\u043E\u043F\u0440\u043E\u0441\u044B \u0430\u043F\u043F\u0430\u0440\u0430\u0442\u043D\u043E\u0433\u043E \u043E\u0431\u0435\u0441\u043F\u0435\u0447\u0435\u043D\u0438\u044F \u043F\u0435\u0440\u0441\u043E\u043D\u0430\u043B\u044C\u043D\u044B\u0445 \u043A\u043E\u043C\u043F\u044C\u044E\u0442\u0435\u0440\u043E\u0432, \u043A\u043E\u043C\u043C\u0443\u043D\u0438\u043A\u0430\u0446\u0438\u0439 \u0438 \u0441\u0435\u0440\u0432\u0435\u0440\u043E\u0432, 3D-\u0433\u0440\u0430\u0444\u0438\u043A\u0438 \u0438 \u0437\u0432\u0443\u043A\u0430, \u0446\u0438\u0444\u0440\u043E\u0432\u043E\u0433\u043E \u0444\u043E\u0442\u043E \u0438 \u0432\u0438\u0434\u0435\u043E, Hi-Fi \u0430\u043F\u043F\u0430\u0440\u0430\u0442\u0443\u0440\u044B \u0438 \u043F\u0440\u043E\u0435\u043A\u0442\u043E\u0440\u043E\u0432, \u043C\u043E\u0431\u0438\u043B\u044C\u043D\u043E\u0439 \u0441\u0432\u044F\u0437\u0438 \u0438 \u043F\u0435\u0440\u0438\u0444\u0435\u0440\u0438\u0438, \u0438\u0433\u0440\u043E\u0432\u044B\u0445 \u043F\u0440\u0438\u043B\u043E\u0436\u0435\u043D\u0438\u0439 \u0438 \u043C\u043D\u043E\u0433\u043E\u0433\u043E \u0434\u0440\u0443\u0433\u043E\u0433\u043E.'
    publisher = 'www.ixbt.com'
    category = 'news'
    cover_url = u'https://www.ixbt.com/images/ixbt-logo-new.jpg'
    language = 'ru'
    auto_cleanup = True

    oldest_article = 30
    max_articles_per_feed = 100

    remove_tags_before = dict(name='div', attrs={'class': 'content'})

    remove_tags_after = dict(name='ul', attrs={'id': 'soc_ShareBlock'})

    feeds = [
        (u'\u0421\u0442\u0430\u0442\u044C\u0438', 'http://www.ixbt.com/export/articles.rss'),
        (u'\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'http://www.ixbt.com/export/news.rss'),
        (u'\u0421\u0432\u0435\u0436\u0438\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438 DVD \u0438 \u0434\u043E\u043C\u0430\u0448\u043D\u0438\u0445 \u043A\u0438\u043D\u043E\u0442\u0435\u0430\u0442\u0440\u043E\u0432', 'http://www.ixbt.com/export/dvdnews.rss'),
        (u'\u0421\u0432\u0435\u0436\u0438\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438 \u0438\u0437 \u043C\u0438\u0440\u0430 Apple', 'http://www.ixbt.com/export/applenews.rss'),
        (u'\u041F\u0440\u043E\u0446\u0435\u0441\u0441\u043E\u0440\u044B', 'http://www.ixbt.com/export/sec_cpu.rss'),
        (u'\u0421\u0438\u0441\u0442\u0435\u043C\u043D\u044B\u0435 \u043F\u043B\u0430\u0442\u044B, \u043F\u0430\u043C\u044F\u0442\u044C \u0438 \u0447\u0438\u043F\u0441\u0435\u0442\u044B', 'http://www.ixbt.com/export/sec_mainboard.rss'),
        (u'D-\u0412\u0438\u0434\u0435\u043E \u0438 TV-\u0442\u044E\u043D\u0435\u0440\u044B', 'http://www.ixbt.com/export/sec_video.rss'),
        (u'\u0421\u0435\u0442\u0438 \u0438 \u0421\u0435\u0440\u0432\u0435\u0440\u044B', 'http://www.ixbt.com/export/sec_comm.rss'),
        (u'\u041E\u043F\u0442\u0438\u0447\u0435\u0441\u043A\u0438\u0435 \u043F\u0440\u0438\u0432\u043E\u0434\u044B \u0438 \u043D\u043E\u0441\u0438\u0442\u0435\u043B\u0438 \u0438\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u0438', 'http://www.ixbt.com/export/sec_optical.rss'),
        (u'\u041F\u0440\u0438\u043D\u0442\u0435\u0440\u044B \u0438 \u041C\u0424\u0423', 'http://www.ixbt.com/export/sec_printer.rss'),
        (u'\u041C\u043E\u043D\u0438\u0442\u043E\u0440\u044B', 'http://www.ixbt.com/export/sec_monitor.rss'),
        (u'\u041D\u043E\u0441\u0438\u0442\u0435\u043B\u0438 \u0438\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u0438', 'http://www.ixbt.com/export/sec_storage.rss'),
        (u'\u0426\u0438\u0444\u0440\u043E\u0432\u043E\u0439 \u0437\u0432\u0443\u043A', 'http://www.ixbt.com/export/sec_multimedia.rss'),
        (u'ProAudio', 'http://www.ixbt.com/export/sec_proaudio.rss'),
        (u'\u0418\u0437\u043E\u0431\u0440\u0430\u0436\u0435\u043D\u0438\u0435 \u0432 \u0447\u0438\u0441\u043B\u0430\u0445', 'http://www.ixbt.com/export/sec_digimage.rss'),
        (u'\u041F\u0440\u043E\u0435\u043A\u0442\u043E\u0440\u044B, \u043A\u0438\u043D\u043E \u0438 \u0434\u043E\u043C\u0430\u0448\u043D\u0438\u0435 \u043A\u0438\u043D\u043E\u0442\u0435\u0430\u0442\u0440\u044B', 'http://www.ixbt.com/export/sec_dvd.rss'),
        (u'\u0426\u0438\u0444\u0440\u043E\u0432\u043E\u0435 \u0432\u0438\u0434\u0435\u043E', 'http://www.ixbt.com/export/sec_divideo.rss'),
        (u'\u041C\u043E\u0431\u0438\u043B\u044C\u043D\u044B\u0435 \u041F\u041A', 'http://www.ixbt.com/export/sec_portopc.rss'),
        (u'\u041C\u043E\u0431\u0438\u043B\u044C\u043D\u044B\u0435 \u0443\u0441\u0442\u0440\u043E\u0439\u0441\u0442\u0432\u0430', 'http://www.ixbt.com/export/sec_pda.rss'),
        (u'\u0412\u0441\u0435\u0433\u0434\u0430 \u043D\u0430 \u0441\u0432\u044F\u0437\u0438', 'http://www.ixbt.com/export/sec_mobile.rss'),
        (u'\u041A\u043E\u0440\u043F\u0443\u0441\u0430, \u0441\u0438\u0441\u0442\u0435\u043C\u044B \u043F\u0438\u0442\u0430\u043D\u0438\u044F \u0438 \u043E\u0445\u043B\u0430\u0436\u0434\u0435\u043D\u0438\u044F', 'http://www.ixbt.com/export/sec_power.rss'),
        (u'\u041A\u043E\u043B\u043E\u043D\u043A\u0430 \u0440\u0435\u0434\u0430\u043A\u0442\u043E\u0440\u0430', 'http://www.ixbt.com/export/sec_editorial.rss'),
        (u'iXBT Live', 'https://www.ixbt.com/live/rss/index/')
        ]


Improved built-in Идеальный пиксель recipe (id_pixel.recipe): HTTPS, articles cleanup. Bonus: favicon
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class IdPixel(BasicNewsRecipe):
    title          = '\u0418\u0434\u0435\u0430\u043B\u044C\u043D\u044B\u0439 \u043F\u0438\u043A\u0441\u0435\u043B\u044C'
    cover_url = u'https://idpixel.ru/i/logo2x.png'
    description           = '\u041D\u043E\u0432\u043E\u0441\u0442\u043D\u043E\u0439 \u043F\u0440\u043E\u0435\u043A\u0442 \u043E \u0440\u0435\u0442\u0440\u043E-\u0438\u0433\u0440\u0430\u0445 \u0438 \u0440\u0435\u0442\u0440\u043E-\u0442\u0435\u0445\u043D\u0438\u043A\u0435. \u0412\u043E\u0441\u044C\u043C\u0438\u0431\u0438\u0442\u043D\u044B\u0435 \u0438\u0433\u0440\u044B, \u0448\u0435\u0441\u0442\u043D\u0430\u0434\u0446\u0430\u0442\u0438\u0431\u0438\u0442\u043D\u044B\u0435 \u043A\u043E\u043D\u0441\u043E\u043B\u0438, \u0434\u043E\u043C\u0430\u0448\u043D\u0438\u0435 \u043A\u043E\u043C\u043F\u044C\u044E\u0442\u0435\u0440\u044B \u0441 \u0438\u0433\u0440\u0430\u043C\u0438 \u043D\u0430 \u043A\u0430\u0441\u0441\u0435\u0442\u0430\u0445 \u0438 \u0442\u0430\u043A \u0434\u0430\u043B\u0435\u0435. \u041C\u044B \u0438\u0449\u0435\u043C \u0440\u0435\u0442\u0440\u043E-\u043D\u043E\u0432\u043E\u0441\u0442\u0438 \u043F\u043E \u0432\u0441\u0435\u043C\u0443 \u0441\u0432\u0435\u0442\u0443 \u0438 \u0434\u043E\u043D\u043E\u0441\u0438\u043C \u0438\u0445 \u0434\u043E \u0432\u0430\u0441.'  # noqa
    publisher             = '\u041C\u0438\u0445\u0430\u0438\u043B \u0421\u0443\u0434\u0430\u043A\u043E\u0432'
    category              = 'news'
    __author__            = 'bugmen00t'
    language              = 'ru'
    no_stylesheets        = False
    remove_javascript = True
    auto_cleanup = True
    oldest_article = 100
    max_articles_per_feed = 50

    remove_tags_before = dict(name='div', attrs={'class':'blog-post'})
    remove_tags_after  = dict(name='div', attrs={'style':'margin: 20px 0 0 2px;font-size: 16px;'})
    remove_tags     = [dict(name='div',attrs={'class':' likely__widget likely__widget_vkontakte'}),
                        dict(name='div', attrs={'class':' likely__widget likely__widget_twitter'}),
                         dict(name='div', attrs={'class':' likely__widget likely__widget_facebook'}),
                         dict(name='div', attrs={'class':' likely__widget likely__widget_telegram'}),
                         dict(name='div', attrs={'class':' likely__widget likely__widget_odnoklassniki'}),
                         dict(name='div', attrs={'class':'comments_input_disabled'}),
                         dict(name='div', attrs={'id':'comments'})
                         ]

    feeds          =      [(u'\u041D\u043E\u0432\u043E\u0441\u0442\u0438', u'https://idpixel.ru/rss/news.rss')]


Fixed built-in Компьютерра recipe (kompiutierra.recipe): HTTPS, revised RSS feeds. Bonus: updated favicon
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

__license__ = 'GPL v3'
__copyright__ = '2015, lcd1232, malexey1984@gmail.com'
__author__ = 'lcd1232'

from calibre.web.feeds.news import BasicNewsRecipe


class Computerra(BasicNewsRecipe):
    title = u'\u041a\u043e\u043c\u043f\u044c\u044e\u0442\u0435\u0440\u0440\u0430'
    __author__ = 'lcd1232 (with fixes by bugmen00t)'
    description = u'\u041A\u043E\u043C\u043F\u044C\u044E\u0442\u0435\u0440\u0440\u0430: \u0432\u0441\u0435 \u043D\u043E\u0432\u043E\u0441\u0442\u0438 \u043F\u0440\u043E \u043A\u043E\u043C\u043F\u044C\u044E\u0442\u0435\u0440\u044B, \u0436\u0435\u043B\u0435\u0437\u043E, \u043D\u043E\u0432\u044B\u0435 \u0442\u0435\u0445\u043D\u043E\u043B\u043E\u0433\u0438\u0438, \u0438\u043D\u0444\u043E\u0440\u043C\u0430\u0446\u0438\u043E\u043D\u043D\u044B\u0435 \u0442\u0435\u0445\u043D\u043E\u043B\u043E\u0433\u0438\u0438'
    cover_url = 'https://yt3.ggpht.com/ytc/AKedOLRCMA71rKaP4HfL2W26A-VdvsBj9BcOo7S6poTR=s900-c-k-c0x00ffffff-no-rj'
    language = 'ru'
    oldest_article = 100
    max_articles_per_feed = 50
    use_embedded_content = False
    remove_javascript = True
    no_stylesheets = False
    conversion_options = {'linearize_tables': True}
    simultaneous_downloads = 5

    remove_tags_before = dict(name='div', attrs={'id': 'article'})

    remove_tags_after = dict(name='div', attrs={'class': 'article-body'})

    remove_tags = [dict(name='div', attrs={'class': 'cta-row'})]

    feeds = [(u'\u041A\u043E\u043C\u043F\u044C\u044E\u0442\u0435\u0440\u0440\u0430', 'https://www.computerra.ru/feed/')]


Fixed built-in МедиаЗона recipe (media_zone.recipe): revised RSS feeds. Bonus: favicon
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class MediaZona(BasicNewsRecipe):
    title = '\u041c\u0435\u0434\u0438\u0430\u0417\u043e\u043d\u0430'
    __author__ = 'bugmen00t'
    description = '\u041E\u0431\u0449\u0435\u0441\u0442\u0432\u0435\u043D\u043D\u043E-\u043F\u043E\u043B\u0438\u0442\u0438\u0447\u0435\u0441\u043A\u043E\u0435 \u0438\u0437\u0434\u0430\u043D\u0438\u0435, \u0441\u0434\u0435\u043B\u0430\u0432\u0448\u0435\u0435 \u0430\u043A\u0446\u0435\u043D\u0442 \u043D\u0430 \u0444\u0443\u043D\u043A\u0446\u0438\u043E\u043D\u0438\u0440\u043E\u0432\u0430\u043D\u0438\u0438 \u0437\u0430\u043A\u043E\u043D\u0430 \u0432 \u0420\u043E\u0441\u0441\u0438\u0438. \u041F\u043E \u043C\u043D\u0435\u043D\u0438\u044E \u0430\u0432\u0442\u043E\u0440\u0438\u0442\u0435\u0442\u043D\u044B\u0445 \u043C\u0435\u0434\u0438\u0430\u044D\u043A\u0441\u043F\u0435\u0440\u0442\u043E\u0432, \u043F\u043E \u0446\u0438\u0442\u0438\u0440\u0443\u0435\u043C\u043E\u0441\u0442\u0438 \u0438 \u043F\u043E\u0441\u0435\u0449\u0430\u0435\u043C\u043E\u0441\u0442\u0438 \u0444\u043E\u0440\u043C\u0430\u0442 \u00AB\u041C\u0435\u0434\u0438\u0430\u0437\u043E\u043D\u044B\u00BB \u043E\u043A\u0430\u0437\u0430\u043B\u0441\u044F \u0432\u0435\u0434\u0443\u0449\u0438\u043C \u0444\u043E\u0440\u043C\u0430\u0442\u043E\u043C \u043D\u043E\u0432\u043E\u0441\u0442\u043D\u043E\u0433\u043E \u0438\u0437\u0434\u0430\u043D\u0438\u044F \u0432 \u0420\u043E\u0441\u0441\u0438\u0438 2015 \u0433\u043E\u0434\u0430. \u00AB\u041C\u0435\u0434\u0438\u0430\u0437\u043E\u043D\u0430\u00BB \u043F\u0438\u0448\u0435\u0442 \u043E \u0440\u0435\u0430\u043B\u044C\u043D\u043E \u043F\u0440\u043E\u0438\u0441\u0445\u043E\u0434\u044F\u0449\u0435\u043C \u0432 \u0420\u043E\u0441\u0441\u0438\u0438, \u043F\u0435\u0440\u0432\u043E\u0439 \u0443\u043B\u0430\u0432\u043B\u0438\u0432\u0430\u044F \u0432\u0435\u043A\u0442\u043E\u0440\u044B \u0440\u0430\u0437\u0432\u0438\u0442\u0438\u044F \u043E\u0431\u0449\u0435\u0441\u0442\u0432\u0430.'  # noqa
    publisher = 'zona.media'
    category = 'news'
    cover_url = u'https://zona.media/s/share/default_mz.png'
    language = 'ru'
    no_stylesheets = False
    remove_javascript = True
    auto_cleanup = True

    oldest_article = 30
    max_articles_per_feed = 100

    remove_tags_before = dict(name='section', attrs={'class': 'mz-layout-content__row pt0 clearfix'})

    remove_tags_after = dict(name='div', attrs={'class': 'mz-publish__wrapper'})

    remove_tags = [
        dict(name='div', attrs={'class': 'mz-agent-banner'}),
        dict(name='section', attrs={'data-share-id': 'post'})
    ]


    feeds = [
        ('\u041C\u0435\u0434\u0438\u0430\u0437\u043E\u043D\u0430', 'https://zona.media/rss'),
        ('\u0411\u0435\u043B\u0430\u0440\u0443\u0441\u044C', 'https://mediazona.by/rss'),
        ('\u0426\u0435\u043D\u0442\u0440\u0430\u043B\u044C\u043D\u0430\u044F \u0410\u0437\u0438\u044F', 'https://mediazona.ca/rss'),
        ]


Fixed built-in Правда.RU recipe (pravda_ru.recipe): HTTPS, revised RSS feeds. N.B.: the site appears to be geo-restricted, so the recipe probably won't work from non-Russian IP addresses. Bonus: updated favicon
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
www.pravda.ru
'''


class Pravda_ru(BasicNewsRecipe):
    title = u'\u041F\u0440\u0430\u0432\u0434\u0430'
    __author__ = 'Darko Miletic (with fixes by bugmen00t)'
    description = u'\u041F\u0440\u0430\u0432\u0434\u0430.\u0420\u0443: \u0410\u043D\u0430\u043B\u0438\u0442\u0438\u043A\u0430 \u0438 \u043D\u043E\u0432\u043E\u0441\u0442\u0438'
    publisher = 'PRAVDA.Ru'
    category = 'news, politics, Russia'
    language = 'ru'
    publication_type = 'newspaper'
    cover_url = 'http://www.pravda.ru/pix/logo.gif'
    oldest_article = 7
    max_articles_per_feed = 50
    auto_cleanup = True
    
    remove_tags_before = dict(name='div', attrs={'class': 'full article full-article'})

    remove_tags_after = dict(name='div', attrs={'class': 'authors-block'})

    remove_tags =   [
        dict(name='div', attrs={'class': 'breadcumbs'})
        ]

    feeds = [
        (u'\u041F\u0440\u0430\u0432\u0434\u0430.RU', 'https://www.pravda.ru/export.xml'),
        (u'\u0421\u0442\u0430\u0442\u044C\u0438', 'https://www.pravda.ru/export-articles.xml'),
        (u'\u041D\u043E\u0432\u043E\u0441\u0442\u0438', 'https://www.pravda.ru/export-news.xml')
        ]
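
Since the site seems to be geo-restricted, it may help to check whether a feed is reachable from your network before running the recipe. A minimal sketch using only the standard library; the function name feed_reachable, the User-Agent string and the timeout are my assumptions, not part of the recipe or of calibre:

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def feed_reachable(url, timeout=10, opener=urlopen):
    """Return True if the URL answers with a 2xx/3xx status (hypothetical helper)."""
    # Some sites reject requests without a browser-like User-Agent.
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    try:
        with opener(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, OSError):
        # Covers HTTP errors, refused connections, DNS failures and timeouts.
        return False


# Usage (needs network access):
#     feed_reachable('https://www.pravda.ru/export.xml')
```

A False result only says the feed could not be fetched from where you are; it does not prove that geo-blocking is the cause.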


Improved built-in Троицкий вариант recipe (trv.recipe): HTTPS, article cleanup. I decided to keep the comments, as they're often more interesting than the articles themselves; they can be removed entirely by uncommenting one line in the remove_tags section.
Spoiler:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8

from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class TrvScience(BasicNewsRecipe):

    title = u'\u0422\u0440\u043e\u0438\u0446\u043a\u0438\u0439 \u0432\u0430\u0440\u0438\u0430\u043d\u0442'
    language = 'ru'
    __author__ = 'Vadim Dyadkin (with fixes by bugmen00t)'
    oldest_article = 30
    max_articles_per_feed = 100
    recursion = 4
    no_stylesheets = True
    simultaneous_downloads = 1
#    cover_url = 'https://i0.wp.com/trv-science.ru/uploads/logo_trv2-e1573805568596-1.png'
    cover_url = 'https://i0.wp.com/trv-science.ru/uploads/cropped-trv_neur-1024.png'
    
    remove_tags_before = dict(name='main', attrs={'id': 'main'})

    remove_tags_after = dict(name='div', attrs={'class': 'wpdiscuz-comment-pagination'})

    remove_tags =   [
        dict(name='span', attrs={'class': 'fa fa-user'}),
        dict(name='h4'),
        dict(name='svg'),
        dict(name='ul', attrs={'class': 'st-related-posts'}),
        dict(name='footer', attrs={'class': 'entry-meta'}),
#        dict(name='div', attrs={'id': 'comments'}),
        dict(name='div', attrs={'class': 'wpd-vote'}),
        dict(name='div', attrs={'class': 'mistape_caption'}),
        dict(name='div', attrs={'class': 'wpd-comment-share wpd-hidden wpd-tooltip wpd-top'}),
        dict(name='div', attrs={'class': 'wpd-comment-left '}),
        dict(name='div', attrs={'class': 'wpd-space'}),
        dict(name='div', attrs={'class': 'wpd-reply-button'}),
        dict(name='div', attrs={'class': 'wpd-comment-link wpd-hidden'}),
        dict(name='div', attrs={'class': 'wpd-comment-last-edited'}),
        dict(name='div', attrs={'class': 'wpd-comment-date'}),
        dict(name='div', attrs={'class': 'wpd-comment-info-bar'}),
        dict(name='div', attrs={'class': 'wpd-form-wrap'})
        ]

    feeds = [(u'\u0422\u0440\u043e\u0438\u0446\u043a\u0438\u0439 \u0432\u0430\u0440\u0438\u0430\u043d\u0442',
              u'https://trv-science.ru/feed/')]
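
If you would rather not edit the list by hand, the comments toggle for the Троицкий вариант recipe could be sketched like this. The helper build_remove_tags and its keep_comments flag are hypothetical (not part of calibre), and the list is shortened to a few of the entries from the recipe:

```python
def build_remove_tags(keep_comments=True):
    """Return a remove_tags list; strip the comments block when asked."""
    tags = [
        dict(name='ul', attrs={'class': 'st-related-posts'}),
        dict(name='footer', attrs={'class': 'entry-meta'}),
        dict(name='div', attrs={'class': 'wpd-form-wrap'}),
    ]
    if not keep_comments:
        # Same selector as the commented-out line in the recipe
        tags.append(dict(name='div', attrs={'id': 'comments'}))
    return tags


# Inside the recipe class this would read:
#     remove_tags = build_remove_tags(keep_comments=False)
```

With keep_comments=True the result matches the recipe's default behaviour of keeping the comment thread.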

Last edited by bugmen00t; 07-23-2022 at 02:54 AM. Reason: Медиазона and Троицкий Вариант recipe small fix