Old 01-13-2010, 03:06 PM   #1142
XanthanGum
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
Need Recipe for MIT Technology Review

Hi,

I've given up on coming up with a good recipe for MIT's Technology Review at:

http://www.technologyreview.com/

Questions:

Some of the reports at MIT Technology Review are split across multiple pages. How do you deal with that?

In the middle of some of the articles, a line stating "Story continues below" occurs along with an advertisement. How do I cut that out?

The site has a print option for each article, but it uses the article id number in each of the print URLs. How would I deal with that?
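For the print-URL question, one possible approach is calibre's `print_version` hook, which receives an article URL and returns the URL to download instead. The sketch below assumes that the article id is the last all-digit path segment of the feed URL and that the print page accepts it as a query parameter. `PRINT_TEMPLATE` and the sample URL are hypothetical and need to be checked against the site's real print links. The helper is written as a plain function so it can be tried outside calibre.

```python
import re

# Hypothetical print-URL pattern; compare it with the site's real "print" links.
PRINT_TEMPLATE = 'http://www.technologyreview.com/printer_friendly_article.aspx?id={id}'

# Assumption: the article id is a path segment made only of digits.
ARTICLE_ID_RE = re.compile(r'/(\d+)(?=/|\?|$)')


def print_version(url):
    """Return the print URL for an article URL.

    Falls back to the original URL when no numeric id is found,
    so unrelated links are still downloaded as-is.
    """
    ids = ARTICLE_ID_RE.findall(url)
    if not ids:
        return url
    # Use the last numeric segment in case earlier ones are dates or sections.
    return PRINT_TEMPLATE.format(id=ids[-1])


if __name__ == '__main__':
    print(print_version('http://www.technologyreview.com/computing/24367/?a=f'))
```

Inside a recipe this would be a method, `def print_version(self, url):`, on the `BasicNewsRecipe` subclass. Print views often show the whole article on one page, so this might also sidestep the multi-page problem, but that is not confirmed for this site.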

I hope someone will improve my recipe and post it here so that I can see how to solve the problems I ran into.

Thanks...

XG

My recipe follows:

from calibre.web.feeds.news import BasicNewsRecipe

class MITtechnologyReview(BasicNewsRecipe):
    title = u'MIT Technology Review'
    __author__ = u'Xanthan Gum'
    description = 'Technology news from MIT'

    no_stylesheets = True

    remove_tags_before = dict(id='articlebody')
    remove_tags_after = dict(name='h3')

    oldest_article = 7
    max_articles_per_feed = 100

    feeds = [(u'Top Stories', u'http://feeds.technologyreview.com/technology_review_top_stories'),
             (u'Computing', u'http://feeds.technologyreview.com/technology_review_Computing'),
             (u'Web', u'http://feeds.technologyreview.com/technology_review_Web'),
             (u'Communications', u'http://feeds.technologyreview.com/technology_review_Communications'),
             (u'Energy', u'http://feeds.technologyreview.com/technology_review_Energy'),
             (u'Materials', u'http://feeds.technologyreview.com/technology_review_Materials'),
             (u'Biomedicine', u'http://feeds.technologyreview.com/technology_review_Biotech'),
             (u'Business', u'http://feeds.technologyreview.com/technology_review_Biztech')]
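
For the "Story continues below" line, one option might be the recipe's `preprocess_regexps` attribute, a list of (compiled pattern, replacement function) pairs that calibre applies to the raw HTML before parsing. The sketch below assumes the marker sits alone inside a `<p>` element; that markup is a guess, and the advertisement next to it would need its own pattern once the real page source has been inspected. The `apply_rules` helper is hypothetical and only exists to try the patterns outside calibre.

```python
import re

# Same shape as BasicNewsRecipe.preprocess_regexps:
# a list of (compiled pattern, function taking the match and returning a string).
# Assumption: the marker text is the only content of a <p> element.
preprocess_regexps = [
    (re.compile(r'<p\b[^>]*>\s*Story continues below\s*</p>', re.IGNORECASE),
     lambda match: ''),
]


def apply_rules(html, rules=preprocess_regexps):
    """Hypothetical helper: apply each rule in order to an HTML string."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html


if __name__ == '__main__':
    sample = ('<p>First half.</p>'
              '<p class="note">Story continues below</p>'
              '<p>Second half.</p>')
    print(apply_rules(sample))
```

In the recipe itself only the `preprocess_regexps` list would be needed, placed as a class attribute next to `remove_tags_before`.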