#1
plus ça change
Posts: 101
Karma: 32134
Join Date: Dec 2009
Location: France
Device: Kindle PW2, Voyage
Recipe for NRC Handelsblad (RSS feeds)
Code:
__license__ = 'GPL v3'
__copyright__ = '2012'
'''
nrc.nl
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class NRC(BasicNewsRecipe):
    title = 'NRC Handelsblad'
    __author__ = 'veezh'
    description = 'Nieuws'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    #delay = 1
    use_embedded_content = False
    encoding = 'utf-8'
    publisher = 'nrc.nl'
    category = 'news, Netherlands, world'
    language = 'nl_NL'
    timefmt = ''
    #publication_type = 'newsportal'
    extra_css = '''
        h1{font-size:130%;}
        #h2{font-size:100%;font-weight:normal;}
        #.href{font-size:xx-small;}
        .bijschrift{color:#666666; font-size:x-small;}
        #.main-article-info{font-family:Arial,Helvetica,sans-serif;}
        #full-contents{font-size:small; font-family:Arial,Helvetica,sans-serif;font-weight:normal;}
        #match-stats-summary{font-size:small; font-family:Arial,Helvetica,sans-serif;font-weight:normal;}
        '''
    #preprocess_regexps = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
    conversion_options = {
        'comments' : description
        ,'tags' : category
        ,'language' : language
        ,'publisher' : publisher
        ,'linearize_tables': True
    }
    remove_empty_feeds = True
    filterDuplicates = True

    # Replace every text-only link with its plain text, so articles carry no hyperlinks.
    def preprocess_html(self, soup):
        for alink in soup.findAll('a'):
            if alink.string is not None:
                tstr = alink.string
                alink.replaceWith(tstr)
        return soup

    # Keep only the article container and drop everything after the body text.
    keep_only_tags = [dict(name='div', attrs={'class':'article'})]
    remove_tags_after = [dict(id='broodtekst')]
    # keep_only_tags = [
    #     dict(name='div', attrs={'class':['label']})
    # ]
    # remove_tags_after = [dict(name='dl', attrs={'class':['tags']})]
    # def get_article_url(self, article):
    #     link = article.get('link')
    #     if 'blog' not in link and ('chat' not in link):
    #         return link
    feeds = [
        # ('Nieuws', 'http://www.nrc.nl/rss.php'),
        ('Binnenland', 'http://www.nrc.nl/nieuws/categorie/binnenland/rss.php'),
        ('Buitenland', 'http://www.nrc.nl/nieuws/categorie/buitenland/rss.php'),
        ('Economie', 'http://www.nrc.nl/nieuws/categorie/economie/rss.php'),
        ('Wetenschap', 'http://www.nrc.nl/nieuws/categorie/wetenschap/rss.php'),
        ('Cultuur', 'http://www.nrc.nl/nieuws/categorie/cultuur/rss.php'),
        ('Boeken', 'http://www.nrc.nl/boeken/rss.php'),
        ('Tech', 'http://www.nrc.nl/tech/rss.php/'),
        ('Klimaat', 'http://www.nrc.nl/klimaat/rss.php/'),
    ]
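
To illustrate what `preprocess_html` in the recipe above does, here is a minimal standalone sketch. It is not calibre code: it assumes a regular expression is an acceptable stand-in for the BeautifulSoup calls, and `unwrap_text_links` and the sample HTML are hypothetical names made up for the illustration.

```python
import re

# Matches an <a> element whose content is non-empty plain text (no nested tags).
# This approximates the `alink.string is not None` check in the recipe.
# Assumption: attribute values do not contain '>' characters.
_TEXT_ONLY_LINK = re.compile(r'<a\b[^>]*>([^<]+)</a>', re.IGNORECASE)


def unwrap_text_links(html):
    """Replace text-only <a> elements with their text; leave other anchors intact."""
    return _TEXT_ONLY_LINK.sub(lambda m: m.group(1), html)


if __name__ == '__main__':
    sample = '<p>Lees <a href="/x">meer</a> of <a href="/y"><img src="i.png"/></a></p>'
    print(unwrap_text_links(sample))
```

The article text survives while the hyperlink around it is dropped; an anchor that wraps an image is left untouched, as it would be in the recipe.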
#2
creator of calibre
Posts: 45,616
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Isn't there already a recipe for NRC Handelsblad?
#3
plus ça change
Yes, but as far as I know, it hasn't worked for quite some time because of major changes to the website.
#4
creator of calibre
Isn't the current recipe a subscription-based one that downloads the epub published by Handelsblad? Are you saying that doesn't work?
#5
plus ça change
Sorry for the confusion. AFAIK, neither the RSS recipe by Darko Miletic (the built-in recipe called NRC) nor the epub recipe works now. The first one stopped working quite some time ago when the website changed, and I'm guessing the epub one no longer works either, since the epub version of the paper is now behind a paywall.
Last edited by veezh; 03-29-2012 at 04:55 AM.
#6
creator of calibre
Ah OK