03-20-2012, 11:23 AM   #2
apiontek
Solved.

Well, I figured out a solution. While browsing further here, I saw that someone else had used
Code:
use_embedded_content = False
to solve something that looked similar. With that, I've come up with a recipe that works:

Spoiler:
Code:
#!/usr/bin/env python

__license__ = 'GPL v3'
'''
Savage Minds
'''

from calibre.web.feeds.news import BasicNewsRecipe

class Savage_Minds(BasicNewsRecipe):
    title = u'Savage Minds'
    description = 'Notes and Queries in Anthropology - A Group Blog'
    cover_url = 'http://savageminds.org/wp-content/themes/SM2009Test/images/sidebar/sidebox.jpg'
    # Download each article's web page instead of using the content embedded in the feed.
    use_embedded_content = False
    oldest_article = 7  # days
    max_articles_per_feed = 100
    auto_cleanup = False
    no_stylesheets = True

    feeds = [(u'Savage Minds Entries', u'http://savageminds.org/feed/')]

    # Keep only the main content container of each page...
    keep_only_tags = [dict(name='div', attrs={'id': 'content'})]
    # ...then drop the parts of it that aren't article text.
    remove_tags = [
        dict(name='div', attrs={'class': 'meta clear'}),
        dict(name='div', attrs={'class': 'snap_nopreview sharing robots-nocontent'}),
        dict(name='div', attrs={'id': 'respond'}),
        dict(name='div', attrs={'class': 'c-grav'}),
        dict(name='span', attrs={'class': 'c-permalink'}),
    ]


It seems that even when I change "oldest_article" to, say, 14 or 20, Calibre still downloads only the latest two articles. In the long run 7 days is fine, though, so I'm not going to worry about it.
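
One guess about the "only two articles" behaviour, which I haven't verified: an age limit can only filter what the feed itself lists, so if the feed holds just two recent entries, raising the limit can't produce more. The sketch below is plain Python, not Calibre's actual code; the function name recent_entries and the sample feed are made up for illustration.

```python
from datetime import datetime, timedelta


def recent_entries(entries, oldest_article_days, now):
    """Return titles of entries published within the last N days.

    `entries` is a list of (title, published_datetime) pairs.
    Hypothetical helper; it only mimics the idea of an age cutoff.
    """
    cutoff = now - timedelta(days=oldest_article_days)
    return [title for title, published in entries if published >= cutoff]


# Made-up feed contents: only two entries are present in the feed.
now = datetime(2012, 3, 20)
feed = [
    ("Post A", datetime(2012, 3, 19)),
    ("Post B", datetime(2012, 3, 16)),
]

week = recent_entries(feed, 7, now)          # both entries fall inside 7 days
three_weeks = recent_entries(feed, 20, now)  # a wider window finds nothing extra
print(week, three_weeks)
```

If that is what's happening, the limit is on the feed's side rather than in the recipe.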