Old 06-09-2013, 12:34 PM   #1
jumafl
Enthusiast
jumafl began at the beginning.
 
Posts: 32
Karma: 10
Join Date: Apr 2012
Device: Amazon Kindle Paperwhite
How to exclude Video from CNN Recipe?

I’m enjoying reading CNN news on my Kindle but I’m having a problem with links to videos being included in the mobi file. Since the actual video isn’t included in the mobi file, these links bring up the next valid story rather than the one I expected to read.

I found this code from Starson17 in the Re-usable Code posting in the Recipe forum that can be used to remove videos:
def parse_feeds(self):
    # Let calibre build the feeds first, then filter the article lists.
    feeds = BasicNewsRecipe.parse_feeds(self)
    for feed in feeds:
        # Iterate over a copy so that removing items is safe.
        for article in feed.articles[:]:
            print('article.title is: %s' % article.title)
            if 'VIDEO' in article.title.upper() or 'GOAT' in article.url:
                feed.articles.remove(article)
    return feeds

I can’t get this code to block videos in the CNN recipe. I believe this is because the recipe code is looking at the link on the CNN feed page which doesn’t contain the word ‘video’. However, the feed page link is redirected to a link that does contain the word ‘video’.

Is there a way to change the CNN recipe to eliminate feeds based on the content of the redirected link?

Here are some examples.

Link to CNN feeds: http://www.cnn.com/services/rss/

Example 1

A feed titled ‘See ship explode Hollywood style’ dated June 9 @ 10:29 AM contains this link: http://rss.cnn.com/~r/rss/cnn_topsto...osion.cnn.html

Following the above link, the browser redirects to this link that contains “/video/”: http://www.cnn.com/video/data/2.0/vi...osion.cnn.html

Example 2

A feed titled ‘Who is the suspected gunman?’ dated June 9 @ 11:07 AM contains this link: http://rss.cnn.com/~r/rss/cnn_topsto...ified.cnn.html

Following the above link, the browser redirects to this link that contains “/video/”: http://www.cnn.com/video/data/2.0/vi...ified.cnn.html
Old 06-09-2013, 12:36 PM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You need to implement preprocess_html, detect the video element and either remove it or return None, in which case the whole article will be skipped.
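
A minimal sketch of what that could look like, not a tested CNN recipe. The tag names in VIDEO_TAGS, the skip_video_articles switch and the class name are assumptions made for illustration; the real CNN markup would have to be inspected to find the actual video element. Returning None to skip the article follows the description above. The try/except around the import is only there so the snippet can be tried outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be exercised outside calibre.
    BasicNewsRecipe = object

# ASSUMPTION: tags that might hold a video player; check the real page markup.
VIDEO_TAGS = ['video', 'object', 'embed']


def find_video_elements(soup):
    """Return every element in the parsed page whose tag name is in VIDEO_TAGS."""
    return soup.findAll(VIDEO_TAGS)


class CNNVideoFilter(BasicNewsRecipe):  # hypothetical recipe name
    title = 'CNN (video filtered)'

    # Hypothetical switch: True skips the whole article,
    # False only strips the video element and keeps the text.
    skip_video_articles = True

    def preprocess_html(self, soup):
        players = find_video_elements(soup)
        if not players:
            return soup  # ordinary article, leave untouched
        if self.skip_video_articles:
            return None  # per the reply above, this skips the article
        for tag in players:
            tag.extract()  # remove the player from the page
        return soup
```

Setting skip_video_articles to False keeps the story text and drops only the player, which may be preferable for articles that merely embed a clip.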
Old 06-09-2013, 07:43 PM   #3
jumafl
Thanks Kovid. I was able to find where preprocess_html is mentioned in the documentation but couldn't find any examples of how to use it to filter articles based on the redirected URL. I searched other recipes hoping to find a working example but so far have had no success.

If you or one of the other experts in this forum have time to take a look at this issue, I would appreciate the help.
Old 06-09-2013, 08:15 PM   #4
kovidgoyal
If you wish to filter based on URL, you can implement get_article_url and return None for those articles you want skipped.
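
A rough sketch of that idea, under a few assumptions. The helpers is_video_url and resolve_redirect and the class name are made up for this example. It assumes the "/video/" marker seen in the redirected links earlier in the thread is a reliable test, and it resolves the redirect with one extra HTTP request per article, which slows the download. If the default get_article_url already returns the final URL for this feed, the resolve_redirect call can be dropped.

```python
from urllib.request import urlopen

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the helpers can be exercised outside calibre.
    BasicNewsRecipe = object


def is_video_url(url):
    """Return True when the URL contains the '/video/' path marker."""
    return bool(url) and '/video/' in url.lower()


def resolve_redirect(url, opener=urlopen):
    """Follow HTTP redirects and return the final URL.

    Falls back to the original URL if the request fails.
    """
    try:
        response = opener(url)
        try:
            return response.geturl()
        finally:
            response.close()
    except (OSError, ValueError):
        return url


class CNNWithoutVideo(BasicNewsRecipe):  # hypothetical recipe name
    title = 'CNN (no video)'

    def get_article_url(self, article):
        url = BasicNewsRecipe.get_article_url(self, article)
        if not url:
            return None
        final_url = resolve_redirect(url)
        if is_video_url(final_url):
            self.log('Skipping video article: %s' % final_url)
            return None  # returning None makes calibre skip the article
        return final_url
```

The opener parameter exists only so the redirect lookup can be swapped out, for example for a cached or offline version while testing the recipe.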