Old 05-28-2011, 04:19 PM   #1
Shuichiro
Junior Member
Posts: 4
Karma: 10
Join Date: May 2011
Device: Kindle
Psychology Today website changed

Can someone please help with altering the default installed "Psychology Today" recipe? I miss reading my articles and whenever I download the feed the articles fail to download.


from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1275708473(BasicNewsRecipe):
    title = u'Psychology Today'
    __author__ = 'rty'
    publisher = u'www.psychologytoday.com'
    category = u'Psychology'
    max_articles_per_feed = 100
    remove_javascript = True
    use_embedded_content = False
    no_stylesheets = True
    language = 'en'
    temp_files = []
    articles_are_obfuscated = True
    remove_tags = [
        dict(name='div', attrs={'class':['print-source_url','field-items','print-footer']}),
        dict(name='span', attrs={'class':'print-footnote'}),
    ]
    remove_tags_before = dict(name='h1', attrs={'class':'print-title'})
    remove_tags_after = dict(name='div', attrs={'class':['field-items','print-footer']})

    feeds = [(u'Contents', u'http://www.psychologytoday.com/articles/index.rss')]

    def get_article_url(self, article):
        return article.get('link', None)

    def get_obfuscated_article(self, url):
        # Open the article page, follow its "print" link, and save that HTML to a temp file.
        br = self.get_browser()
        br.open(url)
        response = br.follow_link(url_regex = r'/print/[0-9]+', nr = 0)
        html = response.read()
        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(html)
        self.temp_files[-1].close()
        return self.temp_files[-1].name

    def get_cover_url(self):
        # Use the first magazine cover image found on the magazine index page.
        index = 'http://www.psychologytoday.com/magazine/'
        soup = self.index_to_soup(index)
        for image in soup.findAll('img',{ "class" : "imagefield imagefield-field_magazine_cover" }):
            return image['src'] + '.jpg'
        return None
Old 05-29-2011, 08:57 AM   #2
vjplo
Enthusiast
Posts: 25
Karma: 1304
Join Date: Jan 2011
Device: Literati
me, too

I miss the downloads. Only the cover comes through.
Old 06-04-2011, 03:41 PM   #3
Shuichiro
bumpy
Old 08-11-2011, 06:54 AM   #4
dstockinger
Austrian
Posts: 7
Karma: 10
Join Date: Jan 2011
Location: Styria, Austria
Device: Amazon Kindle WiFi
Any progress with this problem? Can I, as a simple user, do something to speed things up?
Old 08-11-2011, 10:37 AM   #5
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by dstockinger
Any progress with this problem? Can I, as a simple user, do something to speed things up?
There are currently about 1000 recipes in Calibre. The web sites they scrape probably change every few months on average, so keeping them all in working order is a huge job. Most recipes were contributed by volunteers and aren't actively maintained. If the original author no longer reads his own recipe or doesn't see a post here requesting a fix, and no other reader of that recipe wants to do the job, it just doesn't get done.

My suggestion is to do it yourself. You can start by removing all the obfuscation handling and the remove_tags settings, as follows:
Code:
# PersistentTemporaryFile and temp_files are only needed if the obfuscation handling is restored.
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1275708473(BasicNewsRecipe):
    title = u'Psychology Today'
    __author__ = 'rty'
    publisher = u'www.psychologytoday.com'
    category = u'Psychology'
    max_articles_per_feed = 100
    remove_javascript = True
    use_embedded_content = False
    no_stylesheets = True
    language = 'en'
    temp_files = []

    feeds = [(u'Contents', u'http://www.psychologytoday.com/articles/index.rss')]

    def get_article_url(self, article):
        return article.get('link', None)

    def get_cover_url(self):
        # Use the first magazine cover image found on the magazine index page.
        index = 'http://www.psychologytoday.com/magazine/'
        soup = self.index_to_soup(index)
        for image in soup.findAll('img',{ "class" : "imagefield imagefield-field_magazine_cover" }):
            return image['src'] + '.jpg'
        return None


See what this gives you. If it retrieves content, then redo the remove_tags portion. If not, check whether the obfuscation settings need work. I haven't tested this at all; I'm just pointing you to a starting point. Read the sticky at the bottom for more links to info on recipes.
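A minimal sketch of the "redo the remove_tags portion" step, assuming the stripped-down recipe does retrieve article bodies: the cleanup settings are added back one at a time. The class names in `remove_tags` are copied from the original recipe earlier in this thread and may no longer match the site's markup. The recipe class name `PsychologyTodayTest` and the fallback stub are hypothetical and exist only so the snippet can also be run outside Calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be inspected outside Calibre; not part of a real recipe.
    class BasicNewsRecipe(object):
        pass


class PsychologyTodayTest(BasicNewsRecipe):
    title = u'Psychology Today (test)'
    language = 'en'
    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    # Keep test downloads small while experimenting.
    max_articles_per_feed = 5

    feeds = [(u'Contents', u'http://www.psychologytoday.com/articles/index.rss')]

    # Re-enable one entry at a time and inspect the output after each run.
    # These selectors come from the old recipe and are unverified against the current site.
    remove_tags = [
        dict(name='div', attrs={'class': ['print-source_url', 'print-footer']}),
        # dict(name='span', attrs={'class': 'print-footnote'}),
    ]
```

Saved as a .recipe file, something like this can be tried from the command line with `ebook-convert myrecipe.recipe .epub --test -vv`, which fetches only a couple of articles per feed and prints verbose output.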
Old 08-12-2011, 02:03 PM   #6
dstockinger
Yes, Calibre downloads "something" with your script (titles etc.), but not whole articles!

It seems that they (Psychology Today) not only changed the format but also switched to multiple feeds organized by topic.

Since I am not experienced in Python and also do not have enough time, the only thing I can do is wait until someone with better Python knowledge finds the time and motivation to fix the recipe. :-(
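If the site really has moved to several topic feeds, a recipe's `feeds` attribute can list more than one (title, URL) pair. The following is only a sketch of that shape: the helper `build_feeds`, the topic names, and the example.com URL pattern are all placeholders, not the site's actual feeds.

```python
def build_feeds(topics, url_template):
    """Return a list of (title, url) tuples in the shape a recipe's `feeds` expects."""
    return [(name, url_template.format(slug=slug)) for name, slug in topics]


# Hypothetical topics and URL pattern; replace with the feeds the site actually lists.
TOPICS = [(u'Relationships', 'relationships'), (u'Work', 'work')]
FEEDS = build_feeds(TOPICS, 'http://www.example.com/{slug}/feed.rss')

for title, url in FEEDS:
    print(title, url)
```

Inside a recipe, the resulting list would simply be assigned to `feeds` in place of the single 'Contents' entry.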
Old 08-26-2011, 05:27 PM   #7
dstockinger
Does anyone have any idea what to do? I miss Psychology Today, and I think I'm not the only one...
Old 08-26-2011, 07:24 PM   #8
rogerx
Enthusiast
Posts: 29
Karma: 244
Join Date: Aug 2011
Location: North Pole, Alaska
Device: Kindle DXG
I took a quick look at this (... because I *really do* care)...

... and found that the RSS link does not lead to a true article feed. It is an RSS feed, but its URLs do not each lead to a single published article. Each URL is a topic description leading to a collection of articles grouped by that topic, either recently published or older.

The topics of these collections seem to be determined by the magazine's own agenda.

A better suggestion would be to Google "psychology rss". There seem to be better RSS feeds for your purpose!
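One way to check where a feed's links actually point is to list each item's title and link and look at the URLs. This is a rough sketch using only the Python standard library; the function name `feed_links` and the sample feed are made up for illustration and do not reproduce the real Psychology Today feed.

```python
import xml.etree.ElementTree as ET


def feed_links(rss_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    pairs = []
    for item in root.iter('item'):
        title = (item.findtext('title') or '').strip()
        link = (item.findtext('link') or '').strip()
        pairs.append((title, link))
    return pairs


# Made-up sample feed: one link to a topic collection, one to a single article.
SAMPLE = """<rss version="2.0"><channel><title>Sample</title>
<item><title>Topic page</title><link>http://www.example.com/collections/topic-a</link></item>
<item><title>Single article</title><link>http://www.example.com/articles/123</link></item>
</channel></rss>"""

for title, link in feed_links(SAMPLE):
    print(title, '->', link)
```

For a live feed, the downloaded XML text would be passed to `feed_links` instead of `SAMPLE`. If most links turn out to be collection pages rather than articles, a plain `feeds` list will not be enough for the recipe.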
Old 08-26-2011, 07:26 PM   #9
rogerx
... oh, and in case you haven't realized yet, this specific magazine seems to be strongly focused on sex and good-looking girls.
Old 08-31-2011, 03:06 PM   #10
dstockinger
No hope in sight for Psychology Today

Thank you very much for the "quick look".

It seems to be a general problem caused by the restructuring of the site, so there is no real hope.

Ah, and yes, they sometimes have good-looking girls on their front page...
But it is still a serious, admittedly popular-science, magazine (I subscribed to the print edition).

Greetings from Austria, Dietmar