There are very few recipes in Japanese, so I wrote one that scrapes the NHK news site. But it sometimes downloads fine and sometimes hangs, and I'm not sure why it's so inconsistent. Does anyone have any advice?
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class ReutersJa(BasicNewsRecipe):
    # feed source: https://www.nhk.or.jp/toppage/rss/index.html
    title = 'NHK News'
    description = 'NHK News in Japanese'
    __author__ = 'Richard A. Steps'
    use_embedded_content = False
    language = 'ja'
    max_articles_per_feed = 30
    remove_javascript = True
    auto_cleanup = True

    # (feed title, feed URL) pairs; English meaning given in the comments
    feeds = [
        ('主要ニュース', 'https://www.nhk.or.jp/rss/news/cat0.xml?format=xml'),  # Top news
        ('社会', 'https://www.nhk.or.jp/rss/news/cat1.xml?format=xml'),  # Society
        ('科学・医療', 'https://www.nhk.or.jp/rss/news/cat3.xml?format=xml'),  # Science & medicine
        ('政治', 'https://www.nhk.or.jp/rss/news/cat4.xml?format=xml'),  # Politics
        ('経済', 'https://www.nhk.or.jp/rss/news/cat5.xml?format=xml'),  # Economy
        ('国際', 'https://www.nhk.or.jp/rss/news/cat6.xml?format=xml'),  # International
        ('スポーツ', 'https://www.nhk.or.jp/rss/news/cat7.xml?format=xml'),  # Sports
        ('文化・エンタメ', 'https://www.nhk.or.jp/rss/news/cat2.xml?format=xml'),  # Culture & entertainment
    ]
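
To narrow down whether the hang comes from the RSS feeds themselves or from the article downloads, a standalone check along these lines might help. It is only a rough sketch using the Python standard library, not anything from calibre: `probe` and `probe_all` are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
import socket
import time
import urllib.error
import urllib.request

# The eight category feeds used by the recipe (cat0 .. cat7).
FEED_URLS = [
    'https://www.nhk.or.jp/rss/news/cat{}.xml?format=xml'.format(n)
    for n in range(8)
]


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch one URL and return (status, elapsed_seconds).

    status is 'ok', 'timeout', or 'error: <reason>'.
    `opener` is injectable so the function can be tried without a network.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()
        status = 'ok'
    except (socket.timeout, TimeoutError):
        # Raised directly when the read stalls past the timeout.
        status = 'timeout'
    except urllib.error.URLError as err:
        # A connect timeout arrives wrapped inside URLError.reason.
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            status = 'timeout'
        else:
            status = 'error: {}'.format(err.reason)
    except OSError as err:
        status = 'error: {}'.format(err)
    return status, time.monotonic() - start


def probe_all(urls, timeout=10.0, opener=urllib.request.urlopen):
    """Probe each URL in turn and return a list of (url, status, seconds)."""
    results = []
    for url in urls:
        status, seconds = probe(url, timeout, opener)
        results.append((url, status, seconds))
    return results
```

Calling `probe_all(FEED_URLS)` a few times and printing the result should show whether any single feed is slow or times out. If all eight come back 'ok' quickly every time, the feeds are probably not the cause and the hang is more likely in the article fetching or cleanup stage.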