Old 11-15-2010, 09:11 AM   #8
Starson17
Recipe to download an EPUB from feed

So that this doesn't get lost, I'm going to repost it here. It's a recipe that grabs a single link to an EPUB.

The recipe overrides build_index, the method that normally gets the masthead image and cover, parses the feed for articles, retrieves the articles, removes tags from them, and so on. All of those steps ultimately produce a local directory structure that looks like an unzipped EPUB.

The recipe grabs the link to one EPUB (the first in the RSS feed), saves the EPUB locally, extracts it, and passes the result back into the recipe system as though all the other steps had been completed normally.
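
The save-and-extract step can also be tried outside calibre. This is only a minimal sketch, assuming Python 3 and nothing beyond the standard library; extract_epub and its arguments are hypothetical names, not part of calibre, and like the recipe it assumes content.opf sits at the top level of the archive.

```python
import os
import tempfile
import zipfile


def extract_epub(epub_bytes, output_dir):
    """Save EPUB bytes to a temp file, unzip into output_dir, return the OPF path.

    Hypothetical helper for illustration only. It assumes content.opf is at
    the top level of the archive, as the recipe does.
    """
    # Write the downloaded bytes to a temporary .epub file.
    with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as tmp:
        tmp.write(epub_bytes)
    try:
        # An EPUB is a zip archive, so zipfile can unpack it directly.
        with zipfile.ZipFile(tmp.name, 'r') as zfile:
            zfile.extractall(output_dir)
    finally:
        # The temporary copy is no longer needed once it is extracted.
        os.remove(tmp.name)
    return os.path.join(output_dir, 'content.opf')
```

In the recipe itself, output_dir corresponds to self.output_dir and the returned path is what build_index hands back.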

To use the recipe, just modify these lines:

epub_feed = "http://feeds.feedburner.com/NowEpubEditions"
soup = self.index_to_soup(epub_feed)
url = soup.find(name = 'feedburner:origlink').string

so that "url" points to an EPUB as in: "http://some.place.com/epubfile.epub"
The sample below grabs the first EPUB in an RSS feed, but you can just supply a single URL directly or grab it from the front page of a newspaper. I've posted a complete recipe to emphasize that the normal recipe methods, like "feeds", "remove_tags", etc. should all be omitted.
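
If you want to check what a feed will give you before running the recipe, something like the following may help. It's a rough sketch, assuming Python 3 and assuming the feed carries the EPUB link in a FeedBurner origLink element; first_epub_url is a hypothetical name, and it uses the standard library's ElementTree rather than calibre's index_to_soup.

```python
import xml.etree.ElementTree as ET


def first_epub_url(feed_xml):
    """Return the text of the first origLink element in the feed, or None.

    Hypothetical helper. It matches on the local tag name, so it does not
    depend on the namespace URI the feed declares.
    """
    root = ET.fromstring(feed_xml)
    for element in root.iter():
        # ElementTree writes namespaced tags as '{uri}local'.
        local_name = element.tag.rsplit('}', 1)[-1]
        if local_name == 'origLink' and element.text:
            return element.text.strip()
    return None
```

Supplying a single URL directly needs none of this: just assign the address to url.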

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#Based on Lars Jacob's Taz Digiabo recipe

__license__ = 'GPL v3'
__copyright__ = '2010, Starson17'

import os, urllib2, zipfile
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile

class NowToronto(BasicNewsRecipe):
    title = u'Now Toronto'
    description = u'Now Toronto'
    __author__ = 'Starson17'
    conversion_options = {
        'no_default_epub_cover' : True
    }

    def build_index(self):
        # Find the link to the first EPUB in the feed
        epub_feed = "http://feeds.feedburner.com/NowEpubEditions"
        soup = self.index_to_soup(epub_feed)
        url = soup.find(name = 'feedburner:origlink').string
        # Download the EPUB to a temporary file
        f = urllib2.urlopen(url)
        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        tmp.close()
        # Extract it into the recipe's output directory
        zfile = zipfile.ZipFile(tmp.name, 'r')
        self.report_progress(0,_('extracting epub'))
        zfile.extractall(self.output_dir)
        zfile.close()
        # Hand the OPF back to calibre as the finished index
        index = os.path.join(self.output_dir, 'content.opf')
        self.report_progress(1,_('epub downloaded and extracted'))
        return index