I’m enjoying reading CNN news on my Kindle but I’m having a problem with links to videos being included in the mobi file. Since the actual video isn’t included in the mobi file, these links bring up the next valid story rather than the one I expected to read.
I found this code from Starson17 in the Re-usable Code posting in the Recipe forum that can be used to remove videos:
def parse_feeds(self):
    # Let the base class build the feed list, then prune it.
    feeds = BasicNewsRecipe.parse_feeds(self)
    for feed in feeds:
        # Iterate over a copy so removing items is safe.
        for article in feed.articles[:]:
            print('article.title is:', article.title)
            if 'VIDEO' in article.title.upper() or 'GOAT' in article.url:
                feed.articles.remove(article)
    return feeds
I can’t get this code to block videos in the CNN recipe. I believe this is because the recipe only checks the article title and the link given on the CNN feed page, and neither contains the word ‘video’. The feed page link is then redirected to a URL that does contain the word ‘video’.
Is there a way to change the CNN recipe to eliminate feeds based on the content of the redirected link?
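One possible approach, as an untested sketch rather than a confirmed fix: resolve each feed link to its final URL and drop the article if that URL contains “/video/”. The helper names below (resolve_final_url, drop_video_articles) are made up for illustration and are not part of calibre. The sketch assumes each article object has a url attribute and each feed has an articles list, as in the parse_feeds code above.

```python
from urllib.request import urlopen


def resolve_final_url(url, opener=urlopen, timeout=30):
    """Follow redirects and return the final URL.

    Falls back to the original URL if the request fails, so a network
    error never removes an article by accident.
    """
    try:
        response = opener(url, timeout=timeout)
        try:
            return response.geturl()
        finally:
            response.close()
    except Exception:
        return url


def drop_video_articles(feeds, resolve=resolve_final_url, marker='/video/'):
    """Remove articles whose redirected URL contains `marker`."""
    for feed in feeds:
        # Iterate over a copy so removing items is safe.
        for article in feed.articles[:]:
            if marker in resolve(article.url).lower():
                feed.articles.remove(article)
    return feeds


# Hypothetical use inside the recipe class:
#
#     def parse_feeds(self):
#         feeds = BasicNewsRecipe.parse_feeds(self)
#         return drop_video_articles(feeds)
```

This makes one extra request per article while the feeds are parsed, so the download will be slower. Inside a recipe it may be preferable to pass a resolve function built on the recipe’s own browser object instead of urlopen; that part is an assumption I have not verified against the CNN recipe.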
Here are some examples.
Link to CNN feeds:
http://www.cnn.com/services/rss/
Example 1
A feed item titled ‘See ship explode Hollywood style’ dated June 9 @ 10:29 AM contains this link:
http://rss.cnn.com/~r/rss/cnn_topsto...osion.cnn.html
Following the above link, the browser redirects to this link that contains “/video/”:
http://www.cnn.com/video/data/2.0/vi...osion.cnn.html
Example 2
A feed item titled ‘Who is the suspected gunman?’ dated June 9 @ 11:07 AM contains this link:
http://rss.cnn.com/~r/rss/cnn_topsto...ified.cnn.html
Following the above link, the browser redirects to this link that contains “/video/”:
http://www.cnn.com/video/data/2.0/vi...ified.cnn.html