Old 06-09-2013, 12:34 PM   #1
jumafl
Enthusiast
jumafl began at the beginning.
 
Posts: 33
Karma: 10
Join Date: Apr 2012
Device: Amazon Kindle Paperwhite
How to exclude Video from CNN Recipe?

I’m enjoying reading CNN news on my Kindle but I’m having a problem with links to videos being included in the mobi file. Since the actual video isn’t included in the mobi file, these links bring up the next valid story rather than the one I expected to read.

I found this code from Starson17 in the Re-usable Code posting in the Recipe forum that can be used to remove videos:
def parse_feeds(self):
    # Let calibre build the feed list first, then prune it.
    feeds = BasicNewsRecipe.parse_feeds(self)
    for feed in feeds:
        # Iterate over a copy so that removing items is safe.
        for article in feed.articles[:]:
            print('article.title is: %s' % article.title)
            if 'VIDEO' in article.title.upper() or 'GOAT' in article.url:
                feed.articles.remove(article)
    return feeds

I can’t get this code to block videos in the CNN recipe. I believe this is because the recipe code only looks at the link listed on the CNN feed page, which doesn’t contain the word ‘video’. However, that feed link redirects to a URL that does contain the word ‘video’.

Is there a way to change the CNN recipe to eliminate articles based on the content of the redirected link?

Here are some examples.

Link to CNN feeds: http://www.cnn.com/services/rss/

Example 1

A feed titled ‘See ship explode Hollywood style’ dated June 9 @ 10:29 AM contains this link: http://rss.cnn.com/~r/rss/cnn_topsto...osion.cnn.html

Following the above link, the browser redirects to this link that contains “/video/”: http://www.cnn.com/video/data/2.0/vi...osion.cnn.html

Example 2

A feed titled ‘Who is the suspected gunman?’ dated June 9 @ 11:07 AM contains this link: http://rss.cnn.com/~r/rss/cnn_topsto...ified.cnn.html

Following the above link, the browser redirects to this link that contains “/video/”: http://www.cnn.com/video/data/2.0/vi...ified.cnn.html
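
One possible approach, sketched under assumptions rather than tested against the live CNN feeds: resolve each feed link to its final URL before deciding whether to keep the article, then drop anything whose final URL contains “/video/”. The helper names resolve_url, is_video and drop_video_articles are hypothetical (they are not part of calibre), and the sketch assumes a Python 3 based calibre where urllib.request is available.

```python
from urllib.request import urlopen


def resolve_url(url, timeout=10):
    """Follow HTTP redirects and return the final URL (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def is_video(url):
    """Return True if the URL path marks a video page, as in the examples above."""
    return '/video/' in url.lower()


def drop_video_articles(feeds, resolve=resolve_url):
    """Remove articles whose redirected URL points at a video page.

    `feeds` is expected to look like the result of
    BasicNewsRecipe.parse_feeds(self): objects with an `articles` list
    whose items have a `url` attribute.
    """
    for feed in feeds:
        # Iterate over a copy so that removing items is safe.
        for article in feed.articles[:]:
            try:
                final_url = resolve(article.url)
            except Exception:
                # Assumption: if the redirect cannot be followed,
                # keep the article and judge it by its feed URL.
                final_url = article.url
            if is_video(final_url):
                feed.articles.remove(article)
    return feeds
```

Inside a recipe, parse_feeds could then return drop_video_articles(BasicNewsRecipe.parse_feeds(self)). Note that this makes one extra request per article while the feeds are parsed, so the download will be somewhat slower.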