07-05-2012, 02:37 PM   #1
_reader
Member
Posts: 24
Karma: 142
Join Date: Sep 2010
Device: K3, KPW
New Recipe - American Thinker

American Thinker is a daily internet publication devoted to the thoughtful exploration of issues of importance to Americans. Contributors are accomplished in fields beyond journalism and animated to write for the general public out of concern for the complex and morally significant questions on the national agenda.

Recipe submitted for review / comments / inclusion.

Code:
import re

from calibre.web.feeds.news import BasicNewsRecipe


class AmericanThinker(BasicNewsRecipe):
	title          = u'American Thinker'
	oldest_article = 30
	max_articles_per_feed = 100

	custom_title 		= "American Thinker" 
	# Adjacent string literals are joined at compile time, so the indentation
	# between lines no longer ends up inside the description text.
	description      	= (
		"American Thinker is a daily internet publication devoted to the thoughtful exploration of issues of importance to Americans. "
		"Contributors are accomplished in fields beyond journalism and animated to write for the general public out of concern for the complex "
		"and morally significant questions on the national agenda. "
		"There is no limit to the topics appearing on American Thinker. National security in all its dimensions -- strategic, economic, diplomatic, "
		"and military -- is emphasized. The right to exist and the survival of the State of Israel are of great importance to us. Business, science, "
		"technology, medicine, management, and economics in their practical and ethical dimensions are also emphasized, as is the state of American culture."
	)
	__author__		= '_reader'
	__date__		= '05 July 2012'
	__version__		= '1.0'
	cover_url        	= 'http://www.americanthinker.com/images/at-logo.gif'
	masthead_url        	= 'http://www.americanthinker.com/images/at-logo.gif'
	language        	= 'en'
	needs_subscription	= False
	publisher		= 'americanthinker.com'
	category		= 'news, commentary'
	tags		= 'news, commentary'
	publication_type	= 'blog'
	no_stylesheets	= True
	use_embedded_content	= False
	encoding		= None
	simultaneous_downloads	= 10
	recursions		= 0
	remove_javascript	= True
	remove_empty_feeds	= True
	auto_cleanup 	= False

	conversion_options = {
		'title': custom_title,
		'comments': description,
		'tags': tags,
		'language': language,
		'publisher': publisher,
		'authors': publisher,
		'smarten_punctuation': True,
	}


	preprocess_regexps = [
		(re.compile(r'<table.*?</table><br><br>', re.DOTALL|re.IGNORECASE), lambda match: ''), 			
		(re.compile(r'<b>Page Printed from.*?</script>', re.DOTALL|re.IGNORECASE), lambda match: ''), 			
		(re.compile(r'<div id="article_box_ad">.*?</div>', re.DOTALL|re.IGNORECASE), lambda match: ''), 			
		]


	feeds		= [(u'Articles', u'http://feeds.feedburner.com/americanthinker_articles'), 
			(u'Blog', u'http://feeds.feedburner.com/americanthinker_blog')]

	# Fetch the printer-friendly version of each article.
	def print_version(self, url):
		return 'http://www.americanthinker.com/printpage/?url=' + url
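
For anyone reviewing the cleanup rules without running a full download, here is a rough standalone sketch of what the recipe appears to do to each article: rewrite the feed URL to the print page, then strip the three regex patterns from the HTML. It assumes the patterns are applied in order as plain substitutions, which is my understanding of how calibre treats preprocess_regexps. The sample HTML and the article URL are made up for illustration and are not taken from the site.

```python
import re

# The same three patterns used in the recipe's preprocess_regexps.
PATTERNS = [
    re.compile(r'<table.*?</table><br><br>', re.DOTALL | re.IGNORECASE),
    re.compile(r'<b>Page Printed from.*?</script>', re.DOTALL | re.IGNORECASE),
    re.compile(r'<div id="article_box_ad">.*?</div>', re.DOTALL | re.IGNORECASE),
]


def strip_boilerplate(html):
    """Apply each pattern in order, replacing every match with an empty string."""
    for pattern in PATTERNS:
        html = pattern.sub('', html)
    return html


def print_version(url):
    """Build the printer-friendly URL the same way the recipe does."""
    return 'http://www.americanthinker.com/printpage/?url=' + url


# Hypothetical page fragment, invented for this example only.
sample = (
    '<TABLE><tr><td>header</td></tr></TABLE><br><br>'
    '<p>Article text.</p>'
    '<div id="article_box_ad"><img src="ad.gif"></div>'
    '<b>Page Printed from: example</b><script>window.print();</script>'
)

if __name__ == '__main__':
    # Hypothetical article URL, used only to show the rewrite.
    print(print_version('http://www.americanthinker.com/2012/07/example.html'))
    print(strip_boilerplate(sample))
```

On the sample above, only the paragraph with the article text should survive. Note that the non-greedy `.*?` stops at the first closing tag, so an ad div containing nested divs would leave its outer `</div>` behind; whether that matters depends on the actual page markup.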