10-17-2012, 03:26 AM   #1
smmadge
New NRC Handelsblad recipe with all formats downloaded in single calibredb entry

With the help of many recipes built by others, I managed to extend the existing subscription-based NRC Handelsblad recipe so that it fetches all the available downloadable formats. It can be scheduled from the command line and shouldn't change the existing e-paper layouts. It downloads all formats (currently epub, mobi and pdf) and imports them into a single calibre database entry. Calibre-server probably needs to be running for the automatic import (I use ServiceEx for this), but the calibre frontend does not, since it is not used to schedule the downloads ("Fetch News" is not used).

To configure for command-line operation:
- Make sure calibre-server is running (either by using ServiceEx or by running the Calibre frontend continuously)
- Put the recipe in any folder (let's assume C:\ServiceEx\NRC Handelsblad.recipe)
- Create a batch file (assume C:\ServiceEx\calibre-schedule.cmd) with the following content, and don't forget to adjust the paths and credentials where necessary:
Code:
"c:\Program Files (x86)\Calibre2\ebook-convert.exe" "C:\ServiceEX\NRC Handelsblad.recipe" .pdf --username cookiemonster --password cookies
The .pdf output argument seems to be necessary for ebook-convert to run. Just ignore it.
- Set GET_MOBI and GET_PDF at the top of the recipe to True or False as appropriate
- Adjust "--library-path d:\CalibreLibraryNRC" in the last lines of the recipe if necessary, or leave it out when only using a single library.
- Configure the Windows scheduler from the command line:
Code:
at 16:20 /every:m,t,w,th,f c:\ServiceEx\calibre-schedule.cmd
at 7:20 /every:sa c:\ServiceEx\calibre-schedule.cmd
That should do it. Happy reading!
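For anyone who prefers a script over a batch file, here is a minimal Python sketch of the same ebook-convert call. The helper names build_convert_command and run_convert are my own (hypothetical, not part of calibre), and the paths and credentials are just the example values from the steps above. Passing the arguments as a list avoids quoting problems with the spaces in the paths.

```python
import subprocess

# Example install location from the batch file above; adjust to your system.
CONVERTER = r"c:\Program Files (x86)\Calibre2\ebook-convert.exe"


def build_convert_command(recipe_path, username, password, converter=CONVERTER):
    """Return the argument list for running the recipe via ebook-convert.

    The ".pdf" argument is the placeholder output that ebook-convert
    seems to require, as noted above.
    """
    return [converter, recipe_path, ".pdf",
            "--username", username, "--password", password]


def run_convert(recipe_path, username, password):
    """Run ebook-convert and return its exit code."""
    return subprocess.call(build_convert_command(recipe_path, username, password))
```

Calling run_convert(r"C:\ServiceEx\NRC Handelsblad.recipe", "cookiemonster", "cookies") should then do the same as the batch file, assuming ebook-convert is installed at that path.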

Code:
import os, time, zipfile  # time is needed for time.strftime() in build_index
from calibre.ptempfile import PersistentTemporaryDirectory
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

GET_MOBI=True
GET_PDF=True

class DownloadAllFormats(BasicNewsRecipe):
    title = u'NRC Handelsblad'
    description = u'Loads all E-Book formats from NRC Handelsblad site at once and imports them as a single book entry in the Calibre database.'
    language = 'nl'
    lang = 'nl-NL'
    __author__ = 'Smmadge and many, many, many others!'
    needs_subscription = True
 
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://login.nrc.nl/login')
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        # Return the browser even when no credentials were supplied
        return br

    def build_index(self):
        browser = self.get_browser()
        domain = "http://digitaleeditie.nrc.nl"
        today = time.strftime("%Y%m%d")
        epublink = domain + "/digitaleeditie/helekrant/epub/nrc_" + today + ".epub"
        mobilink = domain + "/digitaleeditie/helekrant/mobipocket/nrc_" + today + ".mobi"
        pdflink = domain + "/digitaleeditie/helekrant/fullpdf/" + today + "_NRC_Handelsblad.pdf"

        # Cheat calibre's recipe method, as in post from Starsom17
        self.report_progress(0,_('downloading epub'))
        response = browser.open(epublink)
        dir = PersistentTemporaryDirectory()
        epub_file = PersistentTemporaryFile(suffix='.epub',dir=dir)
        epub_file.write(response.read())
        epub_file.close()
        zfile = zipfile.ZipFile(epub_file.name, 'r')
        self.report_progress(0.1,_('extracting epub'))
        zfile.extractall(self.output_dir)
        zfile.close()  # epub_file was already closed above; close the zip archive instead
        index = os.path.join(self.output_dir, 'content.opf')
        self.report_progress(0.2,_('epub downloaded and extracted'))

        if GET_MOBI:
            self.report_progress(0.3,_('downloading mobi'))
            mobi_file = PersistentTemporaryFile(suffix='.mobi',dir=dir)
            browser.back()
            response = browser.open(mobilink)
            mobi_file.write(response.read())
            mobi_file.close()

        if GET_PDF:
            self.report_progress(0.4,_('downloading pdf'))
            pdf_file = PersistentTemporaryFile(suffix='.pdf',dir=dir)
            browser.back()
            response = browser.open(pdflink)
            pdf_file.write(response.read())
            pdf_file.close()

        # Get all formats into Calibre's database as one single book entry
        self.report_progress(0.6,_('Adding files to Calibre db'))
        # Raw string keeps the backslash literal; quotes protect temp paths containing spaces
        cmd = r'calibredb add -1 --library-path d:\CalibreLibraryNRC "%s"' % dir
        os.system(cmd)

        return index
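
If you want to check the download links or the calibredb call outside of calibre, the two pieces can be pulled out into plain functions. This is only a sketch: edition_links and calibredb_add_command are hypothetical helper names of mine, the URL patterns are copied from build_index above, and the site may of course change them.

```python
import time

DOMAIN = "http://digitaleeditie.nrc.nl"


def edition_links(day=None):
    """Return the epub/mobi/pdf URLs for one edition.

    day is a YYYYMMDD string; it defaults to today, as in the recipe.
    """
    day = day or time.strftime("%Y%m%d")
    base = DOMAIN + "/digitaleeditie/helekrant"
    return {
        "epub": "%s/epub/nrc_%s.epub" % (base, day),
        "mobi": "%s/mobipocket/nrc_%s.mobi" % (base, day),
        "pdf": "%s/fullpdf/%s_NRC_Handelsblad.pdf" % (base, day),
    }


def calibredb_add_command(folder, library_path=None):
    """Build the calibredb argument list that adds a folder as one book.

    "-1" treats all files in the folder as formats of a single book.
    Leave library_path as None when only using a single library.
    """
    cmd = ["calibredb", "add", "-1"]
    if library_path:
        cmd += ["--library-path", library_path]
    cmd.append(folder)
    return cmd
```

The list from calibredb_add_command can be handed to subprocess.call, which sidesteps the quoting that os.system needs when the temporary folder path contains spaces.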

Last edited by smmadge; 04-02-2013 at 12:08 PM. Reason: Minor edit