12-22-2010, 11:32 AM   #1
veezh
plus ça change
NRC Handelsblad

This recipe downloads the full EPUB edition of NRC Handelsblad, which is (temporarily?) available for free without a login.
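
The edition is fetched from a date-stamped URL. As a rough sketch of how such a URL could be built for any date (the recipe itself only ever asks for today's edition), something like the following should work. The host and path are the ones used in the recipe; `edition_url` is a hypothetical helper name, and the snippet is plain Python 3 that runs outside calibre.

```python
import datetime

# Host and path as used in the recipe; edition_url is a hypothetical helper.
DOMAIN = "http://digitaleeditie.nrc.nl"
PATH = "/digitaleeditie/helekrant/epub/nrc_%s.epub"


def edition_url(day):
    """Return the EPUB URL for the edition published on `day` (a datetime.date)."""
    # The file name embeds the date as YYYYMMDD, matching time.strftime("%Y%m%d").
    return DOMAIN + PATH % day.strftime("%Y%m%d")


# Usage: build the URL for today's edition.
todays_url = edition_url(datetime.date.today())
```

Passing yesterday's date instead would be one way to try an older edition when today's paper is not out yet, assuming the site keeps older files online.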

Thanks to Lars Jacob for his Taz Digiabo recipe, on which this is heavily based, and to Starson17, who posted a reference to it in the reusable code sticky.

Code:
#!/usr/bin/env  python
# -*- coding: utf-8 -*-
#Based on Lars Jacob's Taz Digiabo recipe

__license__   = 'GPL v3'
__copyright__ = '2010, veezh'

'''
www.nrc.nl
'''
import os, urllib2, zipfile
import datetime, time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile


class NRCHandelsblad(BasicNewsRecipe):

    title = u'NRC Handelsblad'
    description = u'De EPUB-versie van NRC'
    language = 'nl'
    lang = 'nl-NL'

    __author__ = 'veezh'

    conversion_options = {
        'no_default_epub_cover' : True
    }

    def build_index(self):
        # The edition file name carries today's date as YYYYMMDD
        today = time.strftime("%Y%m%d")
        domain = "http://digitaleeditie.nrc.nl"

        url = domain + "/digitaleeditie/helekrant/epub/nrc_" + today + ".epub"

        try:
            f = urllib2.urlopen(url)
        except urllib2.HTTPError:
            # Messages: "Cannot log in to download edition" /
            # "Today's paper is not available yet"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            raise ValueError('Krant van vandaag nog niet beschikbaar')

        # Save the download to a temporary file and close it before reading it back
        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        tmp.close()

        # An EPUB is a zip archive: unpack it into the recipe's output directory
        self.report_progress(0,_('extracting epub'))
        zfile = zipfile.ZipFile(tmp.name, 'r')
        zfile.extractall(self.output_dir)
        zfile.close()

        # Return the OPF file of the extracted EPUB as the index
        index = os.path.join(self.output_dir, 'content.opf')

        self.report_progress(1,_('epub downloaded and extracted'))

        return index
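
The recipe hard-codes the name of the OPF file. An EPUB normally names its OPF in META-INF/container.xml, so a more defensive lookup could read that file after extraction. A minimal sketch, assuming the archive has already been unpacked into a directory; `find_opf` is a hypothetical helper, not part of calibre, and the snippet is plain Python 3.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# XML namespace used by META-INF/container.xml in EPUB archives
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"


def find_opf(extract_dir):
    """Return the path of the OPF named in container.xml, or None if not found."""
    container = os.path.join(extract_dir, "META-INF", "container.xml")
    if not os.path.exists(container):
        return None
    root = ET.parse(container).getroot()
    rootfile = root.find("%srootfiles/%srootfile" % (CONTAINER_NS, CONTAINER_NS))
    if rootfile is None or rootfile.get("full-path") is None:
        return None
    # full-path uses forward slashes relative to the archive root
    return os.path.join(extract_dir, *rootfile.get("full-path").split("/"))


# Usage: fake an extracted EPUB with a container.xml and look up its OPF.
with tempfile.TemporaryDirectory() as work:
    os.makedirs(os.path.join(work, "META-INF"))
    with open(os.path.join(work, "META-INF", "container.xml"), "w") as fh:
        fh.write(
            '<container version="1.0" '
            'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
            '<rootfiles><rootfile full-path="OEBPS/content.opf" '
            'media-type="application/oebps-package+xml"/></rootfiles>'
            '</container>'
        )
    found_rel = os.path.relpath(find_opf(work), work)
    missing = find_opf(os.path.join(work, "does-not-exist"))
```

Inside a recipe, the result of such a lookup could replace the fixed 'content.opf' path, falling back to the fixed name when the lookup returns None.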
12-22-2010, 12:10 PM   #2
Starson17
Wizard
Quote:
Originally Posted by veezh View Post
Thanks to Lars Jacob for his Taz Digiabo recipe, on which this is heavily based, and to Starson17, who posted a reference to it in the reusable code sticky.
And thanks to Kovid for telling us about Lars Jacob and his Taz Digiabo recipe. Without Kovid's pointer I don't think we'd have known it was possible to directly grab an EPUB with a recipe.

I surely didn't.
03-07-2011, 10:22 AM   #3
Snaab
Junior Member

Hi there,

The NRC Handelsblad website has changed. Authentication is now required, but the digital edition can now be downloaded even with a home subscription, so that's fair enough. I changed the script, following the example of the New York Times recipe:

Code:
#!/usr/bin/env  python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe

__license__   = 'GPL v3'
__copyright__ = '2011, Snaab'

'''
www.nrc.nl
'''
import os, urllib2, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile


class NRCHandelsblad(BasicNewsRecipe):

    title = u'NRC Handelsblad'
    description = u'De ePaper-versie van NRC'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True

    __author__ = 'Snaab'

    conversion_options = {
        'no_default_epub_cover' : True
    }
    
    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        # Log in only when calibre has been given subscription credentials
        if self.username is not None and self.password is not None:
            br.open('http://login.nrc.nl/login')
            # The login form is assumed to be the first form on the page
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def build_index(self):
        
        today = time.strftime("%Y%m%d")
        
        domain = "http://digitaleeditie.nrc.nl"

        url = domain + "/digitaleeditie/helekrant/epub/nrc_" + today + ".epub"
        #print url

        try:
            br = self.get_browser()
            f = br.open(url)
        except Exception:
            # Messages: "Cannot log in to download edition" /
            # "Today's paper is not available yet"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            raise ValueError('Krant van vandaag nog niet beschikbaar')

        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        f.close()
        br.close()
        # Close (and so flush) the temporary file before reading it back by name
        tmp.close()

        if zipfile.is_zipfile(tmp.name):
            try:
                self.report_progress(0,_('extracting epub'))
                zfile = zipfile.ZipFile(tmp.name, 'r')
                zfile.extractall(self.output_dir)
            except zipfile.BadZipfile:
                self.report_progress(0,_('BadZip error, continuing'))

        index = os.path.join(self.output_dir, 'metadata.opf')

        self.report_progress(1,_('epub downloaded and extracted'))

        return index
By the way, I added the exception handling around unzipping because on Linux extraction threw an error even though the EPUB was extracted correctly. The archive is probably slightly malformed and Windows doesn't care.
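
For anyone who wants to try the tolerant unzipping outside calibre, here is a minimal stand-alone sketch of the same idea: check that the file looks like a zip, extract it, and carry on quietly if the archive turns out to be bad. `extract_epub` is a hypothetical helper, and the snippet targets Python 3, where the exception is spelled `zipfile.BadZipFile` (the recipe uses the Python 2 spelling `BadZipfile`).

```python
import os
import tempfile
import zipfile


def extract_epub(epub_path, dest_dir):
    """Extract epub_path into dest_dir; return member names, or [] if unreadable."""
    if not zipfile.is_zipfile(epub_path):
        return []
    try:
        with zipfile.ZipFile(epub_path, "r") as zf:
            zf.extractall(dest_dir)
            return zf.namelist()
    except zipfile.BadZipFile:
        # Mirror the recipe: report nothing fatal and let the caller continue
        return []


# Usage: one well-formed archive and one file that is not a zip at all.
with tempfile.TemporaryDirectory() as work:
    good = os.path.join(work, "good.epub")
    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("content.opf", "<package/>")
    bad = os.path.join(work, "bad.epub")
    with open(bad, "wb") as fh:
        fh.write(b"this is not a zip archive")

    good_names = extract_epub(good, os.path.join(work, "out"))
    opf_extracted = os.path.exists(os.path.join(work, "out", "content.opf"))
    bad_names = extract_epub(bad, os.path.join(work, "out_bad"))
```

Returning the member names makes it easy to see afterwards whether anything was actually extracted, which the recipe's silent "continue" does not tell you.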

This way it works pretty well on my Linux machine (it takes a while to run, although a walk to my mailbox takes longer), and thanks to an update a few months ago I now have my newspaper neatly in the "Periodicals" section of my Sony e-reader.

Could someone update this news source, as the old one no longer works?

Cheers,

Snaab
03-07-2011, 10:50 AM   #4
kovidgoyal
creator of calibre
Done.

Tags
dutch, nieuws, nrc
