11-29-2010, 06:12 PM   #1
RedDogInCan
Recipe for Business Spectator (Australia)

Here is a recipe to collect articles from The Business Spectator, an Australian business news and commentary web site.

Code:
__license__   = 'GPL v3'
__copyright__ = '2010, Dean Cording'
'''
www.businessspectator.com.au
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class BusinessSpectator(BasicNewsRecipe):
    title                  = 'Business Spectator'
    __author__             = 'Dean Cording'
    description            = 'Australian Business News & commentary delivered the way you want it.'
    masthead_url           = 'http://www.businessspectator.com.au/bs.nsf/logo-business-spectator.gif'
    cover_url              = masthead_url

    oldest_article         = 2
    max_articles_per_feed  = 100
    no_stylesheets         = True
    #delay                  = 1
    use_embedded_content   = False
    encoding               = 'utf8'
    publisher              = 'Business Spectator'
    category               = 'News, Australia, Business'
    language               = 'en_AU'
    publication_type       = 'newsportal'
    preprocess_regexps     = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
    conversion_options = {
                             'comments'        : description
                            ,'tags'            : category
                            ,'language'        : language
                            ,'publisher'       : publisher
                            ,'linearize_tables': False
                         }

    keep_only_tags    =  [dict(id='storyHeader'), dict(id='body-html')]

    remove_tags = [dict(attrs={'class':'hql'})]

    remove_attributes = ['width','height','style']

    feeds          = [
                      ('Top Stories', 'http://www.businessspectator.com.au/top-stories.rss'),
                      ('Alan Kohler', 'http://www.businessspectator.com.au/bs.nsf/RSS?readform&type=spectators&cat=Alan%20Kohler'),
                      ('Robert Gottliebsen', 'http://www.businessspectator.com.au/bs.nsf/RSS?readform&type=spectators&cat=Robert%20Gottliebsen'),
                      ('Stephen Bartholomeusz', 'http://www.businessspectator.com.au/bs.nsf/RSS?readform&type=spectators&cat=Stephen%20Bartholomeusz'),
                      ('Daily Dossier', 'http://www.businessspectator.com.au/bs.nsf/RSS?readform&type=kgb&cat=dossier'),
                      ('Australia', 'http://www.businessspectator.com.au/bs.nsf/RSS?readform&type=region&cat=australia'),
                    ]
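
For anyone wondering what the preprocess_regexps line in the recipe does: it strips HTML comments from each downloaded page before calibre parses it. Below is a small standalone check that uses the same pattern outside calibre. The sample HTML and the strip_comments helper name are made up for illustration and are not part of the recipe.

```python
import re

# Same pattern as in the recipe's preprocess_regexps entry.
# re.DOTALL lets '.' match newlines, so multi-line comments are removed too;
# the non-greedy '.*?' stops at the first closing '-->'.
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)


def strip_comments(html):
    """Return html with every <!-- ... --> comment removed (illustrative helper)."""
    return COMMENT_RE.sub(lambda m: '', html)


# Invented sample markup, only to show the effect.
sample = '<p>Markets rose.</p><!-- ad\nslot --><p>Banks fell.</p>'
cleaned = strip_comments(sample)
print(cleaned)
```

Because the match is non-greedy, two separate comments on one page are removed individually and the markup between them is kept.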
12-01-2010, 12:34 AM   #2
RedDogInCan
Dang, it turns out only part of this web site is publicly viewable. The interesting commentary articles require you to register and log in first.

This script needs more work.
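
One possible direction, as a rough sketch only: calibre recipes can set needs_subscription = True and override get_browser() to log in before fetching articles. Everything site-specific below is an assumption that has not been checked against the site: the login URL, the position of the login form on the page, and the form field names are placeholders to be replaced with the real values. The log_in helper and the class name are also just illustrative. The try/except import is there only so the sketch can be loaded outside calibre.

```python
try:
    from calibre.web.feeds.recipes import BasicNewsRecipe
except ImportError:
    # Outside calibre: minimal stand-in so this sketch can still be imported.
    class BasicNewsRecipe(object):
        username = None
        password = None

        def get_browser(self, *args, **kwargs):
            raise NotImplementedError('run this recipe inside calibre')

# ASSUMPTIONS: none of these values come from the site; replace them with
# whatever the real login page uses.
LOGIN_URL = 'http://www.businessspectator.com.au/HYPOTHETICAL-LOGIN-PAGE'
USERNAME_FIELD = 'username'   # hypothetical form field name
PASSWORD_FIELD = 'password'   # hypothetical form field name


def log_in(br, username, password):
    """Fill in and submit the login form on a mechanize-style browser."""
    br.open(LOGIN_URL)
    br.select_form(nr=0)      # assumption: the login form is the first form
    br[USERNAME_FIELD] = username
    br[PASSWORD_FIELD] = password
    br.submit()
    return br


class BusinessSpectatorSubscriber(BasicNewsRecipe):
    title = 'Business Spectator'
    # Tells calibre to ask the user for a username and password.
    needs_subscription = True

    # keep_only_tags, remove_tags, feeds etc. would be carried over
    # unchanged from the recipe in the first post.

    def get_browser(self, *args, **kwargs):
        br = BasicNewsRecipe.get_browser(self, *args, **kwargs)
        if self.username is not None and self.password is not None:
            log_in(br, self.username, self.password)
        return br
```

Keeping the form handling in a separate function makes it possible to try the login steps against a stand-in browser object before running a full download.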
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.