10-17-2012, 03:26 AM | #1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2012
Device: PRS-T1
New NRC Handelsblad recipe with all formats downloaded in single calibredb entry
With the help of many recipes built by others, I managed to extend the existing subscription-based NRC Handelsblad recipe to fetch all of the available downloadable formats. It can be scheduled from the command line and shouldn't change the existing e-paper layouts. It downloads all formats (currently epub, mobi and pdf) and imports them into a single calibre database entry. Calibre-server probably needs to be running for the automatic import (I use ServiceEx for this), but the calibre front end does not, since it is not used to schedule the downloads ("Fetch News" is not used).
To configure for command-line operation:
- Make sure calibre-server is running (either via ServiceEx or by keeping the Calibre front end running continuously).
- Put the recipe in any folder (let's assume C:\ServiceEx\NRC Handelsblad.recipe).
- Create a batch file (assume C:\ServiceEx\calibre-schedule.cmd) with the following content, and don't forget to adjust it where necessary:
Code:
"c:\Program Files (x86)\Calibre2\ebook-convert.exe" "C:\ServiceEx\NRC Handelsblad.recipe" .pdf --username cookiemonster --password cookies
- Set GET_MOBI and GET_PDF in the recipe to True or False as appropriate.
- Adjust `--library-path d:\CalibreLibraryNRC` in the last lines of the recipe if necessary, or leave it out when only using a single library.
- Configure the Windows scheduler from the command line:
Code:
at 16:20 /every:m,t,w,th,f c:\ServiceEx\calibre-schedule.cmd
at 7:20 /every:sa c:\ServiceEx\calibre-schedule.cmd
Code:
import os
import time
import zipfile

from calibre.ptempfile import PersistentTemporaryDirectory
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

# Set to False to skip the corresponding format
GET_MOBI = True
GET_PDF = True


class DownloadAllFormats(BasicNewsRecipe):
    title = u'NRC Handelsblad'
    description = u'Loads all E-Book formats from NRC Handelsblad site at once and imports them as a single book entry in the Calibre database.'
    language = 'nl'
    lang = 'nl-NL'
    __author__ = 'Smmadge and many, many, many others!'
    needs_subscription = True

    def get_browser(self):
        # Log in with the credentials passed via --username/--password.
        # Recent calibre versions expect self here; the calibre releases
        # this was written for accepted BasicNewsRecipe.get_browser().
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://login.nrc.nl/login')
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def build_index(self):
        browser = self.get_browser()
        domain = "http://digitaleeditie.nrc.nl"
        today = time.strftime("%Y%m%d")
        epublink = domain + "/digitaleeditie/helekrant/epub/nrc_" + today + ".epub"
        mobilink = domain + "/digitaleeditie/helekrant/mobipocket/nrc_" + today + ".mobi"
        pdflink = domain + "/digitaleeditie/helekrant/fullpdf/" + today + "_NRC_Handelsblad.pdf"

        # Cheat calibre's recipe method, as in post from Starsom17
        self.report_progress(0, _('downloading epub'))
        response = browser.open(epublink)
        tmp_dir = PersistentTemporaryDirectory()
        epub_file = PersistentTemporaryFile(suffix='.epub', dir=tmp_dir)
        epub_file.write(response.read())
        epub_file.close()

        # Unpack the epub into the output directory so calibre can use its OPF
        zfile = zipfile.ZipFile(epub_file.name, 'r')
        self.report_progress(0.1, _('extracting epub'))
        zfile.extractall(self.output_dir)
        zfile.close()
        index = os.path.join(self.output_dir, 'content.opf')
        self.report_progress(0.2, _('epub downloaded and extracted'))

        if GET_MOBI:
            self.report_progress(0.3, _('downloading mobi'))
            mobi_file = PersistentTemporaryFile(suffix='.mobi', dir=tmp_dir)
            browser.back()
            response = browser.open(mobilink)
            mobi_file.write(response.read())
            mobi_file.close()

        if GET_PDF:
            self.report_progress(0.4, _('downloading pdf'))
            pdf_file = PersistentTemporaryFile(suffix='.pdf', dir=tmp_dir)
            browser.back()
            response = browser.open(pdflink)
            pdf_file.write(response.read())
            pdf_file.close()

        # Get all formats into Calibre's database as one single book entry
        self.report_progress(0.6, _('Adding files to Calibre db'))
        cmd = r"calibredb add -1 --library-path d:\CalibreLibraryNRC " + tmp_dir
        os.system(cmd)
        return index

Last edited by smmadge; 04-02-2013 at 12:08 PM. Reason: Minor edit
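For anyone adapting the recipe, the two parts most likely to need changes are the dated download links and the calibredb call. Below is a rough sketch of both as standalone helpers. It is not part of the recipe: the function names `edition_urls` and `calibredb_add_command` are made up for illustration, while the URL patterns and the calibredb arguments are copied from the recipe above.

```python
import datetime

DOMAIN = "http://digitaleeditie.nrc.nl"


def edition_urls(day):
    """Return the download links for one edition (hypothetical helper).

    The patterns are the ones hard-coded in the recipe; whether the
    site still serves them is not checked here.
    """
    stamp = day.strftime("%Y%m%d")
    base = DOMAIN + "/digitaleeditie/helekrant"
    return {
        "epub": "%s/epub/nrc_%s.epub" % (base, stamp),
        "mobi": "%s/mobipocket/nrc_%s.mobi" % (base, stamp),
        "pdf": "%s/fullpdf/%s_NRC_Handelsblad.pdf" % (base, stamp),
    }


def calibredb_add_command(book_dir, library_path=None):
    """Build the argument list for 'calibredb add -1' (hypothetical helper).

    -1 tells calibredb to treat the directory as one book, so the epub,
    mobi and pdf files in it end up as formats of a single entry.
    --library-path is only added when a library is given.
    """
    cmd = ["calibredb", "add", "-1"]
    if library_path:
        cmd += ["--library-path", library_path]
    cmd.append(book_dir)
    return cmd


if __name__ == "__main__":
    for fmt, url in sorted(edition_urls(datetime.date.today()).items()):
        print(fmt, url)
    print(calibredb_add_command(r"C:\Temp\nrc", r"d:\CalibreLibraryNRC"))
```

Passing the list from `calibredb_add_command` to `subprocess.call(...)` instead of building a string for `os.system` should avoid quoting problems when the temporary directory path contains spaces.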