08-11-2011, 09:37 AM   #5
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by dstockinger View Post
Any progress with this problem??? Can i, as a simple user, do something to speed things up?
There are currently about 1000 recipes in Calibre. The web sites they scrape probably change on average every few months, so keeping them all in working order is a huge job. Most recipes were contributed by volunteers and aren't actively maintained. If the original author isn't reading his own recipe, or doesn't see a post here requesting a fix, or some other reader of that recipe doesn't want to do the job, it just doesn't get done.

My suggestion is to do it yourself. You can start by stripping out all of the obfuscation handling and the remove_tags settings, as follows:
Spoiler:
Code:
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1275708473(BasicNewsRecipe):
    title          = u'Psychology Today'
    __author__ = 'rty'
    publisher = u'www.psychologytoday.com'
    category = u'Psychology'
    max_articles_per_feed = 100
    remove_javascript = True
    use_embedded_content   = False
    no_stylesheets = True
    language = 'en'
    temp_files = []

    feeds          = [(u'Contents', u'http://www.psychologytoday.com/articles/index.rss')]

    def get_article_url(self, article):
        return article.get('link', None)

    def get_cover_url(self):
        index = 'http://www.psychologytoday.com/magazine/'
        soup = self.index_to_soup(index)
        # Use the first image carrying the magazine-cover classes
        for image in soup.findAll('img', {'class': 'imagefield imagefield-field_magazine_cover'}):
            return image['src'] + '.jpg'
        return None


See what this gives you. If it retrieves content, then add the remove_tags portion back, updated for the site's current markup. If not, check whether the obfuscation settings need work. I haven't tested this at all; I'm just pointing you to a starting point. Read the sticky at the bottom for more links to info on recipes.
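If the stripped-down recipe retrieves nothing, it can help to rule out the feed itself before touching the recipe. Here's a rough sketch that uses only the Python standard library (no calibre) to list the title and link of each feed item, which is roughly what get_article_url relies on. The sample feed, its titles and its example.com links are made up for illustration, and article_links is just a helper name I picked, not part of calibre.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the real feed; titles and links are made up.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Contents</title>
    <item>
      <title>First article</title>
      <link>http://www.example.com/articles/first</link>
    </item>
    <item>
      <title>Second article</title>
    </item>
  </channel>
</rss>"""


def article_links(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed.

    The link is None when an item has no <link> element, mirroring
    article.get('link', None) in the recipe.
    """
    root = ET.fromstring(feed_xml)
    pairs = []
    for item in root.iter('item'):
        title = item.findtext('title', default='')
        link = item.findtext('link')  # None if the element is missing
        pairs.append((title, link))
    return pairs


if __name__ == '__main__':
    for title, link in article_links(SAMPLE_FEED):
        print(title, '->', link)
```

To try it on the real feed, save the RSS from the feeds URL in the recipe to a file and pass its contents to article_links. Items that come back with None for the link are ones the recipe can't fetch. For the recipe itself, running ebook-convert on the .recipe file with --test -vv is still the real check.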