With the help of many recipes built by others, I managed to extend the existing subscription-based NRC Handelsblad recipe to fetch all the available downloadable formats. It can be scheduled from the command line and shouldn't change the existing epaper layouts. It downloads all formats (currently epub, mobi and pdf) and imports them as a single calibre database entry. Calibre-server probably needs to be running for the automatic import (I use ServiceEx for this), but the calibre frontend does not, since it is not used to schedule the downloads ("Fetch News" is not used).
To configure for commandline operation:
- Make sure calibre-server is running, either via ServiceEx or by keeping the Calibre frontend running continuously
- Put the recipe in any folder (let's assume C:\ServiceEx\NRC Handelsblad.recipe)
- Create a batch file (assume C:\ServiceEx\calibre-schedule.cmd) with the following content, and don't forget to adjust it where necessary:
Code:
"c:\Program Files (x86)\Calibre2\ebook-convert.exe" "C:\ServiceEX\NRC Handelsblad.recipe" .pdf --username cookiemonster --password cookies
The .pdf seems to be necessary for ebook-convert to run. Just ignore it.
- Set GET_MOBI and GET_PDF at the top of the recipe to True or False, depending on which formats you want
- Adjust "--library-path d:\CalibreLibraryNRC" in the last lines of the recipe if necessary, or leave it out when only using a single library.
- Schedule the batch file with the Windows scheduler from the command line:
Code:
at 16:20 /every:m,t,w,th,f c:\ServiceEx\calibre-schedule.cmd
at 7:20 /every:sa c:\ServiceEx\calibre-schedule.cmd
That should do it. Happy reading!
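If you would rather start the download from a Python script than from a batch file, the same ebook-convert command line could be assembled roughly like this. This is only a sketch: the function name build_convert_command and its parameters are made up for illustration and are not part of calibre, and the paths and credentials are simply the example values from the batch file above.

```python
def build_convert_command(exe, recipe, username, password, output=".pdf"):
    """Return the ebook-convert argument list matching the batch file above.

    Hypothetical helper, not a calibre API.
    """
    # Passing arguments as a list avoids quoting problems with the
    # spaces in "Program Files (x86)" and "NRC Handelsblad.recipe".
    return [exe, recipe, output, "--username", username, "--password", password]


cmd = build_convert_command(
    r"c:\Program Files (x86)\Calibre2\ebook-convert.exe",
    r"C:\ServiceEx\NRC Handelsblad.recipe",
    "cookiemonster",
    "cookies",
)
print(cmd)

# To actually run it on a machine with calibre installed at that path:
#   import subprocess
#   subprocess.call(cmd)
```

The ".pdf" argument is kept as the default because, as noted above, ebook-convert seems to need an output argument to run.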
Code:
import os, time, zipfile
from calibre.ptempfile import PersistentTemporaryDirectory
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

# Set to False to skip a format
GET_MOBI = True
GET_PDF = True

class DownloadAllFormats(BasicNewsRecipe):
    title = u'NRC Handelsblad'
    description = u'Loads all E-Book formats from NRC Handelsblad site at once and imports them as a single book entry in the Calibre database.'
    language = 'nl'
    lang = 'nl-NL'
    __author__ = 'Smmadge and many, many, many others!'
    needs_subscription = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        # Log in with the subscription credentials passed on the command line
        if self.username is not None and self.password is not None:
            br.open('http://login.nrc.nl/login')
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def build_index(self):
        browser = self.get_browser()
        domain = "http://digitaleeditie.nrc.nl"
        today = time.strftime("%Y%m%d")
        epublink = domain + "/digitaleeditie/helekrant/epub/nrc_" + today + ".epub"
        mobilink = domain + "/digitaleeditie/helekrant/mobipocket/nrc_" + today + ".mobi"
        pdflink = domain + "/digitaleeditie/helekrant/fullpdf/" + today + "_NRC_Handelsblad.pdf"

        # Cheat calibre's recipe method, as in post from Starsom17
        self.report_progress(0, _('downloading epub'))
        response = browser.open(epublink)
        tmp_dir = PersistentTemporaryDirectory()
        epub_file = PersistentTemporaryFile(suffix='.epub', dir=tmp_dir)
        epub_file.write(response.read())
        epub_file.close()
        zfile = zipfile.ZipFile(epub_file.name, 'r')
        self.report_progress(0.1, _('extracting epub'))
        zfile.extractall(self.output_dir)
        zfile.close()
        index = os.path.join(self.output_dir, 'content.opf')
        self.report_progress(0.2, _('epub downloaded and extracted'))

        if GET_MOBI:
            self.report_progress(0.3, _('downloading mobi'))
            mobi_file = PersistentTemporaryFile(suffix='.mobi', dir=tmp_dir)
            browser.back()
            response = browser.open(mobilink)
            mobi_file.write(response.read())
            mobi_file.close()

        if GET_PDF:
            self.report_progress(0.4, _('downloading pdf'))
            pdf_file = PersistentTemporaryFile(suffix='.pdf', dir=tmp_dir)
            browser.back()
            response = browser.open(pdflink)
            pdf_file.write(response.read())
            pdf_file.close()

        # Get all formats into Calibre's database as one single book entry
        self.report_progress(0.6, _('Adding files to Calibre db'))
        # Quote the temp folder in case its path contains spaces
        cmd = r'calibredb add -1 --library-path d:\CalibreLibraryNRC "' + tmp_dir + '"'
        os.system(cmd)
        return index
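
To make it easier to see what the recipe actually requests and runs, here is a small standalone sketch of the two pieces that usually need adjusting: the dated download links and the calibredb call. The helper names edition_links and calibredb_add_args are my own, not part of calibre, and the date and folder in the example are arbitrary. The URL patterns and the "add -1 --library-path" options are copied from the recipe above.

```python
import datetime

DOMAIN = "http://digitaleeditie.nrc.nl"


def edition_links(day):
    """Return the download URL per format for the given date (hypothetical helper)."""
    stamp = day.strftime("%Y%m%d")
    base = DOMAIN + "/digitaleeditie/helekrant"
    return {
        "epub": base + "/epub/nrc_" + stamp + ".epub",
        "mobi": base + "/mobipocket/nrc_" + stamp + ".mobi",
        "pdf": base + "/fullpdf/" + stamp + "_NRC_Handelsblad.pdf",
    }


def calibredb_add_args(folder, library_path=None):
    """Build the calibredb argument list; omit --library-path for a single library."""
    args = ["calibredb", "add", "-1"]
    if library_path:
        args += ["--library-path", library_path]
    args.append(folder)
    return args


# Arbitrary example date, only to show the resulting URL
links = edition_links(datetime.date(2012, 3, 5))
print(links["epub"])
# -> http://digitaleeditie.nrc.nl/digitaleeditie/helekrant/epub/nrc_20120305.epub

# Arbitrary example folder; the recipe uses a calibre temporary directory here
print(calibredb_add_args(r"C:\Temp\nrc", r"d:\CalibreLibraryNRC"))
```

Building the calibredb call as a list (for use with subprocess.call instead of os.system) would also sidestep the quoting of paths with spaces, but that is a variation on the recipe, not what it currently does.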