02-22-2016, 10:03 PM | #1 |
Zealot
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
|
Today (Singapore)
The existing recipe for the Today (Singapore) newspaper no longer works. Here is my fixed version:
Code:
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1276486274(BasicNewsRecipe):
    title = u'Today Online - Singapore'
    publisher = 'MediaCorp Press Ltd - Singapore'
    __author__ = 'rty'
    category = 'news, Singapore'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    use_embedded_content = False
    no_stylesheets = True
    language = 'en_SG'
    temp_files = []
    articles_are_obfuscated = True
    masthead_url = 'http://www.todayonline.com/sites/all/themes/today/logo.png'
    conversion_options = {'linearize_tables': True}

    extra_css = '''
        .author{font-style: italic; font-size: small}
        .date{font-style: italic; font-size: small}
        .Headline{font-weight: bold; font-size: xx-large}
        .headerStrap{font-weight: bold; font-size: x-large; font-style: italic}
        .bodyText{font-size: 4px;font-family: Times New Roman;}
    '''

    feeds = [
        (u'Hot News', u'http://www.todayonline.com/hot-news/feed'),
        (u'Singapore', u'http://www.todayonline.com/feed/singapore'),
        (u'World', u'http://www.todayonline.com/feed/world'),
        (u'Business', u'http://www.todayonline.com/feed/business'),
        (u'Tech', u'http://www.todayonline.com/feed/tech'),
        (u'Voices', u'http://www.todayonline.com/feed/voices'),
        (u'Commentary', u'http://www.todayonline.com/feed/Commentary'),
        (u'Daily Focus', u'http://www.todayonline.com/feed/daily-focus'),
        (u'Lifestyle', u'http://www.todayonline.com/feed/lifestyle'),
    ]

    # A bare string passed as attrs is matched against the tag's CSS class.
    keep_only_tags = [
        dict(name='div', attrs='print-content')
    ]

    remove_tags = [
        dict(name='div', attrs={'class': ['url', 'button']}),
        dict(name='div', attrs={'class': 'node-type-print-edition'}),
        dict(name='div', attrs={'class': [
            'field field-name-field-article-section field-type-taxonomy-term-reference field-label-hidden',
            'field field-name-field-article-abstract field-type-text-long field-label-hidden',
            'authoring',
        ]})
    ]

    def get_obfuscated_article(self, url):
        # Open the article, follow its first link containing '/print/',
        # and save that page to a temporary file for calibre to process.
        br = self.get_browser()
        br.open(url)
        response = br.follow_link(url_regex=r'/print/', nr=0)
        html = response.read()
        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(html)
        self.temp_files[-1].close()
        return self.temp_files[-1].name

    def preprocess_html(self, soup):
        # Drop inline style attributes from every tag that has one.
        for item in soup.findAll(style=True):
            del item['style']
        return soup

Last edited by rty; 02-23-2016 at 01:26 AM. |
02-23-2016, 01:23 AM | #2 |
I also noticed that the recipe for AsiaOne, another news portal in Singapore, no longer works properly. I'm not sure whether it needs its own thread, but here is my fixed version:
Code:
#!/usr/bin/env python2

__license__ = 'GPL v3'
__copyright__ = '2009, Bruce <bruce at dotdoh.com>'
'''
asiaone.com
'''

from calibre.web.feeds.news import BasicNewsRecipe


class AsiaOne(BasicNewsRecipe):
    title = u'AsiaOne'
    oldest_article = 2
    max_articles_per_feed = 100
    __author__ = 'Bruce'
    description = 'News from Singapore Press Holdings Portal'
    no_stylesheets = False
    language = 'en_SG'
    remove_javascript = True

    remove_tags = [dict(name='span', attrs={'class': 'footer'})]

    keep_only_tags = [
        dict(name='h1', attrs={'class': 'headline'}),
        dict(name='div', attrs={'class': ['article-content', 'person-info row']})
    ]

    feeds = [
        ('Singapore', 'http://asiaone.feedsportal.com/c/34151/f/618415/index.rss'),
        ('Asia', 'http://asiaone.feedsportal.com/c/34151/f/618416/index.rss')
    ]

Last edited by rty; 02-23-2016 at 01:25 AM. |