Well, I figured out a solution. While browsing here some more, I saw that someone else had used
Code:
use_embedded_content = False
to solve something that looked similar. With that, I've come up with a solution that's working:
Code:
#!/usr/bin/env python
__license__ = 'GPL v3'
'''
Savage Minds
'''
import string
import re
from calibre.web.feeds.news import BasicNewsRecipe


class Savage_Minds(BasicNewsRecipe):
    title = u'Savage Minds'
    description = 'Notes and Queries in Anthropology - A Group Blog'
    cover_url = 'http://savageminds.org/wp-content/themes/SM2009Test/images/sidebar/sidebox.jpg'
    # Download the linked article pages instead of using the content embedded in the feed
    use_embedded_content = False
    # Maximum article age, in days
    oldest_article = 7
    max_articles_per_feed = 100
    auto_cleanup = False
    no_stylesheets = True
    feeds = [(u'Savage Minds Entries', u'http://savageminds.org/feed/')]
    # Keep only the main content container of each page
    keep_only_tags = [dict(name='div', attrs={'id':'content'})]
    # Then strip post metadata, sharing widgets, and the comment form from it
    remove_tags = [dict(name='div', attrs={'class':'meta clear'}),
                   dict(name='div', attrs={'class':'snap_nopreview sharing robots-nocontent'}),
                   dict(name='div', attrs={'id':'respond'}),
                   dict(name='div', attrs={'class':'c-grav'}),
                   dict(name='span', attrs={'class':'c-permalink'})
                   ]
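For anyone wondering what keep_only_tags and remove_tags do to a page, here is a rough stand-alone sketch of the idea. It is not calibre's implementation: it uses xml.etree.ElementTree from the standard library instead of calibre's own HTML parsing, the PAGE markup is a made-up simplification of the real site, and matches() and filter_page() are hypothetical helpers that only handle exact attribute matches and keep just the first matching element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page; the real Savage Minds markup is more complex.
PAGE = """<html><body>
  <div id="sidebar">Blogroll</div>
  <div id="content">
    <h2>Post title</h2>
    <div class="meta clear">Posted by someone</div>
    <p>Article text.</p>
    <div id="respond">Leave a reply</div>
  </div>
</body></html>
"""


def matches(elem, spec):
    """True if elem has the tag name and the exact attribute values in spec."""
    if elem.tag != spec['name']:
        return False
    return all(elem.get(k) == v for k, v in spec.get('attrs', {}).items())


def filter_page(page, keep_spec, remove_specs):
    """Keep the first element matching keep_spec, then drop unwanted descendants."""
    root = ET.fromstring(page)
    # keep_only_tags step (simplified): keep one matching element and its children
    kept = next(e for e in root.iter() if matches(e, keep_spec))
    # remove_tags step: remove every descendant that matches any remove spec
    for parent in list(kept.iter()):
        for child in list(parent):
            if any(matches(child, s) for s in remove_specs):
                parent.remove(child)
    return kept


content = filter_page(
    PAGE,
    dict(name='div', attrs={'id': 'content'}),
    [dict(name='div', attrs={'class': 'meta clear'}),
     dict(name='div', attrs={'id': 'respond'})],
)
# Collapse whitespace so only the surviving text is shown
text = ' '.join(''.join(content.itertext()).split())
print(text)
```

The sidebar is dropped because it sits outside the kept div, and the meta and respond blocks are stripped from inside it, which is the same two-stage filtering the recipe relies on.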
Even when I change "oldest_article" to, say, 14 or 20, Calibre still only downloads the latest two articles. In the long run 7 days is fine, though, so I'm not going to worry about it.
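One possible explanation, not verified against the actual feed: an age cutoff such as oldest_article can only filter the entries the feed already lists, so if the feed itself only carries a couple of entries, raising the cutoff cannot bring back more. The sketch below illustrates that with a hypothetical within_cutoff() helper and invented dates; it is a model of the idea, not calibre's code.

```python
from datetime import datetime, timedelta


def within_cutoff(published_dates, oldest_article_days, now):
    """Return the dates that are no older than the cutoff, newest first."""
    cutoff = now - timedelta(days=oldest_article_days)
    return sorted((d for d in published_dates if d >= cutoff), reverse=True)


# Hypothetical feed that only contains two entries (dates are made up)
now = datetime(2012, 1, 20)
feed_dates = [datetime(2012, 1, 18), datetime(2012, 1, 15)]

for days in (7, 14, 20):
    # Print the cutoff in days and how many entries pass it
    print(days, len(within_cutoff(feed_dates, days, now)))
```

With these made-up dates, all three settings pass the same two entries, since there is nothing older in the list for a larger window to pick up.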