07-28-2012, 08:55 PM | #1 |
Connoisseur
Posts: 55
Karma: 13316
Join Date: Jul 2012
Device: iPad
|
Philosophy Now Recipe
I'm very honored to find out that Kovid Goyal quickly included my previous recipes (The New Republic, Psychology Today and Smithsonian) in Calibre, so I am uploading another one. This one requires a subscription. Hopefully someone finds it helpful.
Code:
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from collections import OrderedDict

class PhilosophyNow(BasicNewsRecipe):

    title = 'Philosophy Now'
    __author__ = 'Rick Shang'

    # description = 'Philosophy Now is a lively magazine for everyone interested in ideas. It isn't afraid to tackle all the major questions of life, the universe and everything. Published every two months, it tries to corrupt innocent citizens by convincing them that philosophy can be exciting, worthwhile and comprehensible, and also to provide some enjoyable reading matter for those already ensnared by the muse, such as philosophy students and academics.'
    language = 'en'
    category = 'news'
    encoding = 'UTF-8'

    keep_only_tags = [dict(attrs={'id':'fullMainColumn'})]
    remove_tags = [dict(attrs={'class':'articleTools'})]
    no_javascript = True
    no_stylesheets = True
    needs_subscription = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        br.open('https://philosophynow.org/auth/login')
        br.select_form(nr = 1)
        br['username'] = self.username
        br['password'] = self.password
        br.submit()
        return br

    def parse_index(self):
        #Go to the issue
        soup0 = self.index_to_soup('http://philosophynow.org/')
        issue = soup0.find('div',attrs={'id':'navColumn'})

        #Find date & cover
        cover = issue.find('div', attrs={'id':'cover'})
        date = self.tag_to_string(cover.find('h3')).strip()
        self.timefmt = u' [%s]'%date
        img=cover.find('img',src=True)['src']
        self.cover_url = 'http://philosophynow.org' + re.sub('medium','large',img)
        issuenum = re.sub('/media/images/covers/medium/issue','',img)
        issuenum = re.sub('.jpg','',issuenum)

        #Go to the main body
        current_issue_url = 'http://philosophynow.org/issues/' + issuenum
        soup = self.index_to_soup(current_issue_url)
        div = soup.find ('div', attrs={'class':'articlesColumn'})

        feeds = OrderedDict()

        for post in div.findAll('h3'):
            articles = []
            a=post.find('a',href=True)
            if a is not None:
                url="http://philosophynow.org" + a['href']
                title=self.tag_to_string(a).strip()
                s=post.findPrevious('h4')
                section_title = self.tag_to_string(s).strip()
                d=post.findNext('p')
                desc = self.tag_to_string(d).strip()
                articles.append({'title':title, 'url':url, 'description':desc, 'date':''})

                if articles:
                    if section_title not in feeds:
                        feeds[section_title] = []
                    feeds[section_title] += articles
        ans = [(key, val) for key, val in feeds.iteritems()]
        return ans
|
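The recipe works out the current issue number from the cover image path and then builds the issue URL from it. Here is a minimal sketch of that one step in isolation. The helper names and the sample path are hypothetical, and the sketch assumes the issue number is purely numeric, which the thread does not confirm. Unlike the recipe's `re.sub('.jpg', ...)`, where `.` matches any character, this version escapes the dot and anchors the extension at the end of the path.

```python
import re

# Hypothetical helper, not part of the recipe. Assumes cover paths look like
# '/media/images/covers/medium/issue91.jpg' with a numeric issue number.
COVER_RE = re.compile(r'/media/images/covers/medium/issue(\d+)\.jpg$')


def issue_number_from_cover(img_src):
    """Return the issue number embedded in a cover image path."""
    match = COVER_RE.search(img_src)
    if match is None:
        # Fail loudly if the site changes its cover naming scheme.
        raise ValueError('unexpected cover path: %r' % img_src)
    return match.group(1)


def large_cover_url(img_src, base='http://philosophynow.org'):
    """Swap 'medium' for 'large', the same substitution the recipe performs."""
    return base + re.sub('medium', 'large', img_src)
```

Raising on an unexpected path makes a site redesign show up as a clear error instead of a malformed issue URL.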
01-10-2013, 10:46 PM | #2 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2013
Device: Kindle DX
|
Philosophy Now recipe not working...
Hi, Can't seem to get this recipe working (I have an active digital subscription to Philosophy Now and can log on to the website okay).
I've entered my login details, but I get the error below. Any ideas?
----------------
calibre, version 0.9.13 (win32, isfrozen: True)
Conversion Error: Failed: Fetch news from Philosophy Now
Fetch news from Philosophy Now
Resolved conversion options
calibre version: 0.9.13
{'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0, 'book_producer': None, 'change_justification': 'original', 'chapter': None, 'chapter_mark': 'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None, 'dehyphenate': True, 'delete_blank_paragraphs': True, 'disable_font_rescaling': False, 'dont_compress': False, 'dont_download_recipe': False, 'duplicate_links_in_toc': False, 'embed_font_family': None, 'enable_heuristics': False, 'extra_css': None, 'extract_to': None, 'filter_css': None, 'fix_indents': True, 'font_size_mapping': None, 'format_scene_breaks': True, 'html_unwrap_factor': 0.4, 'input_encoding': None, 'input_profile': <calibre.customize.profiles.InputProfile object at 0x000000000499E278>, 'insert_blank_line': False, 'insert_blank_line_size': 0.5, 'insert_metadata': False, 'isbn': None, 'italicize_common_cases': True, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0, 'linearize_tables': False, 'lrf': False, 'margin_bottom': 5.0, 'margin_left': 5.0, 'margin_right': 5.0, 'margin_top': 5.0, 'markup_chapter_headings': True, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'mobi_file_type': 'old', 'mobi_ignore_margins': False, 'mobi_keep_original_images': False, 'mobi_toc_at_start': False, 'no_chapters_in_toc': False, 'no_inline_navbars': True, 'no_inline_toc': False, 'output_profile': <calibre.customize.profiles.KindleDXOutput object at 0x000000000499E860>, 'page_breaks_before': None, 'personal_doc': '[PDOC]', 'prefer_author_sort': False, 'prefer_metadata_cover': False, 'pretty_print': False, 'pubdate': None, 'publisher': None, 'rating': None, 'read_metadata_from_opf': None, 'remove_fake_margins': True, 'remove_first_image': False, 'remove_paragraph_spacing': False, 'remove_paragraph_spacing_indent_size': 1.5, 'renumber_headings': True, 'replace_scene_breaks': '', 'search_replace': None, 'series': None, 'series_index': None, 'share_not_sync': False, 'smarten_punctuation': False, 'sr1_replace': '', 'sr1_search': '', 'sr2_replace': '', 'sr2_search': '', 'sr3_replace': '', 'sr3_search': '', 'start_reading_at': None, 'subset_embedded_fonts': False, 'tags': None, 'test': False, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'toc_title': None, 'unsmarten_punctuation': False, 'unwrap_lines': True, 'use_auto_toc': False, 'verbose': 2}
InputFormatPlugin: Recipe Input running
Using custom recipe
Python function terminated unexpectedly
no control matching name 'username' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 186, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1009, in run
  File "site-packages\calibre\customize\conversion.py", line 239, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\recipe_input.py", line 108, in convert
  File "site-packages\calibre\web\feeds\news.py", line 787, in __init__
  File "<string>", line 31, in get_browser
  File "site-packages\mechanize-0.2.5-py2.7.egg\mechanize\_form.py", line 2780, in __setitem__
  File "site-packages\mechanize-0.2.5-py2.7.egg\mechanize\_form.py", line 3101, in find_control
  File "site-packages\mechanize-0.2.5-py2.7.egg\mechanize\_form.py", line 3185, in _find_control
mechanize._form.ControlNotFoundError: no control matching name 'username'
|
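The traceback ends in `ControlNotFoundError: no control matching name 'username'`, which suggests that the form selected by position (`nr = 1`, where mechanize counts from 0) is not the login form on the page as currently served. One way to check is to list the forms on the page and see which one actually holds the `username` field. The sketch below does that with Python 3's standard `html.parser` instead of mechanize, so it runs anywhere. The sample HTML, the `FormLister` class and the `find_form_index` helper are all made up for illustration; the real login page may be laid out differently.

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collects each <form>'s name attribute and the names of its <input>s."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per form, in document order
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form':
            self._current = {'name': attrs.get('name'), 'controls': []}
            self.forms.append(self._current)
        elif tag == 'input' and self._current is not None and attrs.get('name'):
            self._current['controls'].append(attrs['name'])

    def handle_endtag(self, tag):
        if tag == 'form':
            self._current = None


def find_form_index(html, control_name):
    """Return the 0-based index of the first form containing control_name."""
    parser = FormLister()
    parser.feed(html)
    for index, form in enumerate(parser.forms):
        if control_name in form['controls']:
            return index
    return None


# Made-up page: a search form, a newsletter form, then the login form.
SAMPLE = '''
<form name="searchForm"><input name="q"></form>
<form name="newsletter"><input name="email"></form>
<form name="loginForm"><input name="username"><input name="password" type="password"></form>
'''

print(find_form_index(SAMPLE, 'username'))
```

With a layout like this sample, index 1 is the newsletter form, so setting `br['username']` on it would fail in the way the log shows. Selecting the form by its name rather than its position avoids depending on how many forms precede it.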
01-11-2013, 12:52 AM | #3 |
Connoisseur
Posts: 55
Karma: 13316
Join Date: Jul 2012
Device: iPad
|
Thanks for informing me
This is an update that should fix all the problems.
Code:
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from collections import OrderedDict

class PhilosophyNow(BasicNewsRecipe):

    title = 'Philosophy Now'
    __author__ = 'Rick Shang'

    # description = 'Philosophy Now is a lively magazine for everyone interested in ideas. It isn't afraid to tackle all the major questions of life, the universe and everything. Published every two months, it tries to corrupt innocent citizens by convincing them that philosophy can be exciting, worthwhile and comprehensible, and also to provide some enjoyable reading matter for those already ensnared by the muse, such as philosophy students and academics.'
    language = 'en'
    category = 'news'
    encoding = 'UTF-8'

    keep_only_tags = [dict(attrs={'id':'fullMainColumn'})]
    remove_tags = [dict(attrs={'class':'articleTools'})]
    no_javascript = True
    no_stylesheets = True
    needs_subscription = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        br.open('https://philosophynow.org/auth/login')
        br.select_form(name="loginForm")
        br['username'] = self.username
        br['password'] = self.password
        br.submit()
        return br

    def parse_index(self):
        #Go to the issue
        soup0 = self.index_to_soup('http://philosophynow.org/')
        issue = soup0.find('div',attrs={'id':'navColumn'})

        #Find date & cover
        cover = issue.find('div', attrs={'id':'cover'})
        date = self.tag_to_string(cover.find('h3')).strip()
        self.timefmt = u' [%s]'%date
        img=cover.find('img',src=True)['src']
        self.cover_url = 'http://philosophynow.org' + re.sub('medium','large',img)
        issuenum = re.sub('/media/images/covers/medium/issue','',img)
        issuenum = re.sub('.jpg','',issuenum)

        #Go to the main body
        current_issue_url = 'http://philosophynow.org/issues/' + issuenum
        soup = self.index_to_soup(current_issue_url)
        div = soup.find ('div', attrs={'class':'contentsColumn'})

        feeds = OrderedDict()

        for post in div.findAll('h1'):
            articles = []
            a=post.find('a',href=True)
            if a is not None:
                url="http://philosophynow.org" + a['href']
                title=self.tag_to_string(a).strip()
                s=post.findPrevious('h3')
                section_title = self.tag_to_string(s).strip()
                d=post.findNext('h2')
                desc = self.tag_to_string(d).strip()
                articles.append({'title':title, 'url':url, 'description':desc, 'date':''})

                if articles:
                    if section_title not in feeds:
                        feeds[section_title] = []
                    feeds[section_title] += articles
        ans = [(key, val) for key, val in feeds.iteritems()]
        return ans

    def cleanup(self):
        self.browser.open('http://philosophynow.org/auth/logout')
|
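The last part of `parse_index` groups articles under their section headings, in page order, and returns a list of `(section title, list of article dicts)` pairs. Here is a stand-alone sketch of that grouping with no calibre dependency. The helper name `group_by_section` and the sample entries are hypothetical. It uses `items()` rather than `iteritems()`, since `iteritems()` exists only on Python 2 while `items()` works on both.

```python
from collections import OrderedDict


def group_by_section(entries):
    """Group (section_title, article) pairs by section, keeping first-seen order.

    Returns a list of (section_title, [article, ...]) tuples, the same shape
    the recipe builds at the end of parse_index.
    """
    feeds = OrderedDict()
    for section_title, article in entries:
        # setdefault creates the list the first time a section is seen.
        feeds.setdefault(section_title, []).append(article)
    # items() works on Python 2 and 3; iteritems() is Python 2 only.
    return list(feeds.items())


# Made-up entries standing in for what the loop over headings would collect.
entries = [
    ('Articles', {'title': 'First', 'url': 'http://example.org/1', 'description': '', 'date': ''}),
    ('Reviews', {'title': 'Second', 'url': 'http://example.org/2', 'description': '', 'date': ''}),
    ('Articles', {'title': 'Third', 'url': 'http://example.org/3', 'description': '', 'date': ''}),
]

for section, articles in group_by_section(entries):
    print(section, [a['title'] for a in articles])
```

Using `OrderedDict` keeps the sections in the order they first appear on the contents page, which a plain `dict` did not guarantee on Python 2.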
|