Quote:
Originally Posted by willswords
Hi there. I'm new to Calibre and was wondering if someone could help me with my recipe for the Deseret News (Salt Lake City, Utah, USA Newspaper, http://desnews.com ). I've cobbled something together from what I have seen in other recipes, but I can't get it to use the mobile url instead of the regular one. The stories come through, but with all the extra stuff I don't want. The mobile versions of the articles look pretty clean though, but I must be doing something wrong because it isn't using the mobile url for the stories.
Here is what I have so far:
Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1284222826(BasicNewsRecipe):
    title = u'Deseret News mobile'
    __author__ = 'WillsWords'
    description = 'Deseret News selected feeds'
    category = 'news, politics, USA, Utah'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    remove_javascript = True
    masthead_url = "http://www.deseretnews.com/media/img/icons/dn-masthead-logo.gif"
    feeds = [
        (u'Top News', u'http://www.deseretnews.com/home/index.rss'),
        (u'Utah', u'http://www.deseretnews.com/utah/index.rss'),
        (u'Movies', u'http://www.deseretnews.com/movies/index.rss'),
        (u'LDS Newsline', u'http://www.deseretnews.com/ldsnews/index.rss'),
        (u'Sports', u'http://www.deseretnews.com/sports/index.rss'),
    ]

def print_version(self, url):
    split1 = url.split("/")
    #url1 = split1[0]
    #url2 = split1[1]
    url3 = split1[2]
    url4 = split1[3]
    url5 = split1[4]
    url6 = split1[5]
    #example of link to convert
    #http://www.deseretnews.com/article/700064426/Elizabeth-Smarts-father-joins-bike-ride-to-lobby-for-laws-protecting-children-from-predators.html
    #http://www.deseretnews.com/mobile/article/700064426/Elizabeth-Smarts-father-joins-bike-ride-to-lobby-for-laws-protecting-children-from-predators.html
    print_url = 'http://' + url3 + '/mobile/' + url4 + '/' + url5 + '/' + url6
    return print_url
Here you go.
Take note of the comments in the following code:
Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1284222826(BasicNewsRecipe):
    title = u'Deseret News mobile'
    __author__ = 'WillsWords'
    description = 'Deseret News selected feeds'
    category = 'news, politics, USA, Utah'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    remove_javascript = True
    # I ADDED keep_only_tags to keep only the content section of the mobile page
    keep_only_tags = [dict(name='div', attrs={'id': ['content']})]
    # I ADDED remove_tags to get rid of the comments and the toolbar at the top
    remove_tags = [dict(name='div', attrs={'id': ['tools', 'story-comments']})]
    masthead_url = "http://www.deseretnews.com/media/img/icons/dn-masthead-logo.gif"
    feeds = [
        (u'Top News', u'http://www.deseretnews.com/home/index.rss'),
        (u'Utah', u'http://www.deseretnews.com/utah/index.rss'),
        (u'Movies', u'http://www.deseretnews.com/movies/index.rss'),
        (u'LDS Newsline', u'http://www.deseretnews.com/ldsnews/index.rss'),
        (u'Sports', u'http://www.deseretnews.com/sports/index.rss'),
    ]

    # I FIXED YOUR INDENT: the def was all the way to the left. It has to be inside the class,
    # so align it with the indent of title, remove_javascript, etc.
    def print_version(self, url):
        split1 = url.split("/")
        url3 = split1[2]  # host, e.g. www.deseretnews.com
        url4 = split1[3]  # 'article'
        url5 = split1[4]  # article id
        url6 = split1[5]  # article slug (.html)
        # Example of link to convert:
        # http://www.deseretnews.com/article/700064426/Elizabeth-Smarts-father-joins-bike-ride-to-lobby-for-laws-protecting-children-from-predators.html
        # http://www.deseretnews.com/mobile/article/700064426/Elizabeth-Smarts-father-joins-bike-ride-to-lobby-for-laws-protecting-children-from-predators.html
        print_url = 'http://' + url3 + '/mobile/' + url4 + '/' + url5 + '/' + url6
        # I ADDED THE FOLLOWING to show you in the log file what the actual print URL is. Once you see it
        # showing the correct URL you should be good to go, other than cleaning up a few tags with
        # keep_only_tags and remove_tags.
        print('THIS URL WILL PRINT: ' + print_url)
        return print_url
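
If you want to check the URL rewrite on its own before running the whole recipe, something like the sketch below might help. It is not part of the recipe above: to_mobile_url is a made-up helper name, it assumes Python 3 (for urllib.parse), and it assumes the mobile page always lives at the same path with /mobile in front, as in the example link from the recipe comments. It also avoids the fixed split1[2]..split1[5] indexes, so a URL with fewer path segments would not raise an IndexError.

```python
from urllib.parse import urlsplit, urlunsplit


def to_mobile_url(url):
    """Return the mobile variant of an article URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Leave URLs that already point at the mobile pages untouched.
    if parts.path.startswith('/mobile/'):
        return url
    # Put '/mobile' in front of the existing path; scheme, host and query stay as they are.
    return urlunsplit(parts._replace(path='/mobile' + parts.path))


article = ('http://www.deseretnews.com/article/700064426/'
           'Elizabeth-Smarts-father-joins-bike-ride-to-lobby-for-laws-'
           'protecting-children-from-predators.html')
mobile = to_mobile_url(article)
print(mobile)
```

Inside the recipe, print_version could then just return the result of that helper, but the string-splitting version above works too as long as every feed link has the same shape.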