#1 |
plus ça change
Posts: 101
Karma: 32134
Join Date: Dec 2009
Location: France
Device: Kindle PW2, Voyage
|
Recipe for NRC Handelsblad (RSS feeds)
Code:
__license__ = 'GPL v3'
__copyright__ = '2012'
'''
nrc.nl
'''

import re
from calibre.web.feeds.recipes import BasicNewsRecipe


class NRC(BasicNewsRecipe):
    title = 'NRC Handelsblad'
    __author__ = 'veezh'
    description = 'Nieuws'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    #delay = 1
    use_embedded_content = False
    encoding = 'utf-8'
    publisher = 'nrc.nl'
    category = 'news, Netherlands, world'
    language = 'nl_NL'
    timefmt = ''
    #publication_type = 'newsportal'
    extra_css = '''
        h1{font-size:130%;}
        #h2{font-size:100%;font-weight:normal;}
        #.href{font-size:xx-small;}
        .bijschrift{color:#666666; font-size:x-small;}
        #.main-article-info{font-family:Arial,Helvetica,sans-serif;}
        #full-contents{font-size:small; font-family:Arial,Helvetica,sans-serif;font-weight:normal;}
        #match-stats-summary{font-size:small; font-family:Arial,Helvetica,sans-serif;font-weight:normal;}
    '''
    #preprocess_regexps = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
    conversion_options = {
        'comments': description,
        'tags': category,
        'language': language,
        'publisher': publisher,
        'linearize_tables': True,
    }

    remove_empty_feeds = True
    filterDuplicates = True

    def preprocess_html(self, soup):
        for alink in soup.findAll('a'):
            if alink.string is not None:
                tstr = alink.string
                alink.replaceWith(tstr)
        return soup

    keep_only_tags = [dict(name='div', attrs={'class':'article'})]
    remove_tags_after = [dict(id='broodtekst')]

    # keep_only_tags = [
    #     dict(name='div', attrs={'class':['label']})
    # ]

    # remove_tags_after = [dict(name='dl', attrs={'class':['tags']})]

    # def get_article_url(self, article):
    #     link = article.get('link')
    #     if 'blog' not in link and ('chat' not in link):
    #         return link

    feeds = [
        # ('Nieuws', 'http://www.nrc.nl/rss.php'),
        ('Binnenland', 'http://www.nrc.nl/nieuws/categorie/binnenland/rss.php'),
        ('Buitenland', 'http://www.nrc.nl/nieuws/categorie/buitenland/rss.php'),
        ('Economie', 'http://www.nrc.nl/nieuws/categorie/economie/rss.php'),
        ('Wetenschap', 'http://www.nrc.nl/nieuws/categorie/wetenschap/rss.php'),
        ('Cultuur', 'http://www.nrc.nl/nieuws/categorie/cultuur/rss.php'),
        ('Boeken', 'http://www.nrc.nl/boeken/rss.php'),
        ('Tech', 'http://www.nrc.nl/tech/rss.php/'),
        ('Klimaat', 'http://www.nrc.nl/klimaat/rss.php/'),
    ]
|
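For readers unsure what the `preprocess_html` method in the recipe above does: it replaces every link whose content is plain text with that text, and leaves links that wrap other tags (such as images) alone. The snippet below is only a rough stand-alone approximation of that behaviour, assuming simple, well-formed markup and using the standard-library `re` module rather than the soup object calibre passes to the recipe. The helper name `unlink_plain_anchors` is hypothetical and is not part of calibre or of the recipe.

```python
import re

# Hypothetical helper, not part of the recipe or of calibre.
# Matches an <a ...> element whose body is non-empty plain text (no child tags).
_PLAIN_ANCHOR = re.compile(r'<a\b[^>]*>([^<]+)</a>', re.IGNORECASE)


def unlink_plain_anchors(html):
    """Replace each text-only <a> element with its text; leave others alone."""
    return _PLAIN_ANCHOR.sub(lambda m: m.group(1), html)


sample = '<p>Lees <a href="/x">meer</a> of <a href="/y"><img src="z.png"/></a></p>'
print(unlink_plain_anchors(sample))
```

In this sketch the text link is flattened to its text while the image link is kept, which mirrors the `alink.string is not None` check in the recipe.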
#2 |
creator of calibre
Posts: 45,310
Karma: 27111242
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Isn't there already a recipe for NRC Handelsblad?
|
#3 |
plus ça change
|
Yes, but as far as I know, it hasn't worked for quite some time because of major changes to the website.
|
#4 |
creator of calibre
|
Isn't the current recipe a subscription-based one that downloads the epub published by Handelsblad? Are you saying that doesn't work?
|
#5 |
plus ça change
|
Sorry for the confusion. AFAIK, neither the RSS recipe by Darko Miletic (the built-in recipe called NRC) nor the epub recipe works now. The first one stopped working quite some time ago when the website changed, and I'm guessing the epub one no longer works either, since the epub version of the paper is now behind a paywall.
Last edited by veezh; 03-29-2012 at 03:55 AM. |
#6 |
creator of calibre
|
Ah OK
|