Old 11-05-2012, 02:38 PM   #1
dncohen
Junior Member
dncohen began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Nov 2012
Device: kindle fire
Seeking help with simple recipe for seedmagazine.com

Hi All,

This is my first recipe and my first Python code, which may explain any stupid questions.

I'm trying to emulate existing recipes to get articles from a site that has no RSS feed. In this case, http://www.seedmagazine.com.

I've looked at their source HTML and, as far as I understand it, to parse the index I want every link on the page that goes to an article. That means a URL that starts with http://seedmagazine.com/content/article/... (Actually, I want to get the print version of those articles, which is a pretty easy substitution.)

I'm attaching my current recipe. It almost works, but instead of getting all the article links on the main page, it gets only the first two. I can't seem to figure out why. Shouldn't soup.findAll('a') return all the anchor tags on the page?

I'd appreciate any advice to get past that problem. And any advice in general because I really don't know how to put the finishing touches on this recipe.

Thanks! -Dave

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe


class seedmagazine(BasicNewsRecipe):
    title = u'Seed Magazine'
    description = u'seedmagazine.com'
    
    oldest_article = 31
    max_articles_per_feed = 5 # keep this number small until recipe works


    def parse_index(self):
        articles = []
        feeds = []
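        seen = set()  # article URLs already added, to skip duplicate links
        
        soup = self.index_to_soup('http://www.seedmagazine.com')

        # href=True skips anchors without an href, which would raise KeyError
        for link in soup.findAll('a', href=True):
            url = link['href']
            title = self.tag_to_string(link).strip()
            
            # Keep only titled article links that have not been seen yet
            if title and '/content/article/' in url and url not in seen:
                seen.add(url)
                articles.append({'title': title,
                                 'url': self.print_version(url),
                                 })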

        if (articles):
            feeds.append((self.title, articles))

        return feeds

    
        
    
    def print_version(self, url):
        return url.replace('/article/', '/print/')
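
To check the link filter and the print-URL substitution without running calibre, a minimal standalone sketch can help. It uses only the standard library's html.parser instead of calibre's soup, and the ArticleLinkParser class and the sample markup are assumptions made up for illustration, not taken from seedmagazine.com.

```python
from html.parser import HTMLParser


class ArticleLinkParser(HTMLParser):
    """Collect (title, url) pairs for anchors whose href contains a marker."""

    def __init__(self, marker='/content/article/'):
        super().__init__()
        self.marker = marker
        self.links = []      # collected (title, url) pairs
        self._href = None    # href of the matching anchor currently open
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href') or ''
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            title = ''.join(self._text).strip()
            if title:  # skip image-only links with no text
                self.links.append((title, self._href))
            self._href = None


def print_version(url):
    # Same substitution as the recipe's print_version method
    return url.replace('/article/', '/print/')


# Made-up sample markup (an assumption, not the real front page)
SAMPLE = '''
<a href="http://seedmagazine.com/content/article/first_story/">First story</a>
<a href="http://seedmagazine.com/about/">About</a>
<a href="http://seedmagazine.com/content/article/second_story/"><img src="x.png"></a>
<a name="top">No href here</a>
<a href="http://seedmagazine.com/content/article/third_story/">Third <b>story</b></a>
'''

parser = ArticleLinkParser()
parser.feed(SAMPLE)
articles = [{'title': t, 'url': print_version(u)} for t, u in parser.links]
for a in articles:
    print(a['title'], a['url'])
```

If this prints every article link in the sample, the filter logic itself is fine and the cause of a short article list is more likely in how the recipe is being run.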
Old 11-06-2012, 12:31 AM   #2
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
As far as I can see, your code works. What command did you use to build it? Use:
ebook-convert see.recipe .epub --debug-pipeline p -vv

If the command you used also included "--test", only 2 articles per feed will show up.
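
The effect of a test run on the list returned by parse_index can be pictured roughly as below. This is only an illustration of the trimming, assuming a limit of 2 feeds and 2 articles per feed; it is not calibre's actual implementation, and apply_test_limits and the sample data are hypothetical names made up for the example.

```python
def apply_test_limits(feeds, max_feeds=2, max_articles=2):
    """Trim parse_index-style output: keep the first few feeds and articles.

    `feeds` is a list of (feed_title, list_of_article_dicts) tuples,
    the same shape that parse_index returns in the recipe above.
    """
    return [(title, articles[:max_articles])
            for title, articles in feeds[:max_feeds]]


# Hypothetical sample: one feed with five articles
feeds = [(
    'Seed Magazine',
    [{'title': 'Article %d' % i, 'url': 'http://example.com/%d' % i}
     for i in range(1, 6)],
)]

trimmed = apply_test_limits(feeds)
for feed_title, articles in trimmed:
    print(feed_title, len(articles))
```

So a recipe that finds five links still yields only two articles in a test run, which looks exactly like a parsing bug.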

I have attached the epub produced by running your code with the above command.
Attached Files
File Type: epub see.epub (400.1 KB, 144 views)
Old 11-06-2012, 01:58 AM   #3
dncohen
Oh, thanks for that. I was running the command with --test (I had copied it that way). I removed the --test and now I'm on my way again.
Old 11-06-2012, 06:04 AM   #4
Krittika Goyal
Congratulations on your first Python recipe. May you have many more.
Old 06-21-2013, 02:04 PM   #5
ehead
Member
ehead began at the beginning.
 
Posts: 13
Karma: 10
Join Date: Apr 2013
Device: Kobo Aura HD
Hey, did you finish this? I was looking for a recipe for Seed myself.
All times are GMT -4.