Old 11-29-2010, 06:02 PM   #1
RedDogInCan
Recipe for ABC News (Australia)

Here is a recipe for loading news articles from the ABC News RSS feeds (http://www.abc.net.au/news/feeds/rss.htm). The ABC is the independent public broadcaster here in Australia and is considered to be relatively unbiased compared to other news sources.

The recipe currently loads articles from Top Stories, the major capital cities, Australia, World, Business, and Science and Technology feeds, but the ABC also provides other feeds on a wide range of topics and geographical areas in Australia.

One problem is that they have a habit of including video-only articles in the feeds, and I haven't thought of a way to filter them out.

Code:
__license__   = 'GPL v3'
__copyright__ = '2010, Dean Cording'
'''
abc.net.au/news
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class ABCNews(BasicNewsRecipe):
    title                  = 'ABC News'
    __author__             = 'Dean Cording'
    description            = 'News from Australia'
    masthead_url           = 'http://www.abc.net.au/news/assets/v5/images/common/logo-news.png'
    cover_url              = 'http://www.abc.net.au/news/assets/v5/images/common/logo-news.png'

    oldest_article         = 2
    max_articles_per_feed  = 100
    no_stylesheets         = False
    #delay                  = 1
    use_embedded_content   = False
    encoding               = 'utf8'
    publisher              = 'ABC News'
    category               = 'News, Australia, World'
    language               = 'en_AU'
    publication_type       = 'newsportal'
    preprocess_regexps     = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
    conversion_options = {
                             'comments'        : description
                            ,'tags'            : category
                            ,'language'        : language
                            ,'publisher'       : publisher
                            ,'linearize_tables': False
                         }

    keep_only_tags    = [dict(id='article')]

    remove_tags = [dict(attrs={'class':['related', 'tags']}),
                     dict(id='statepromo')
                        ]

    remove_attributes = ['width','height']

    feeds          = [
                      ('Top Stories', 'http://www.abc.net.au/news/syndicate/topstoriesrss.xml'),
                      ('Canberra', 'http://www.abc.net.au/news/indexes/idx-act/rss.xml'),
                      ('Sydney', 'http://www.abc.net.au/news/indexes/sydney/rss.xml'),
                      ('Melbourne', 'http://www.abc.net.au/news/indexes/melbourne/rss.xml'),
                      ('Brisbane', 'http://www.abc.net.au/news/indexes/brisbane/rss.xml'),
                      ('Perth', 'http://www.abc.net.au/news/indexes/perth/rss.xml'),
                      ('Australia', 'http://www.abc.net.au/news/indexes/idx-australia/rss.xml'),
                      ('World', 'http://www.abc.net.au/news/indexes/world/rss.xml'),
                      ('Business', 'http://www.abc.net.au/news/indexes/business/rss.xml'),
                      ('Science and Technology', 'http://www.abc.net.au/news/tag/science-and-technology/rss.xml'),
                    ]
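
On the video-only articles mentioned above, here is a minimal sketch of one possible filter. It assumes the recipe overrides calibre's `parse_feeds` and prunes each feed's `articles` list before anything is downloaded. The marker strings (`/video/` in the URL, a title starting with "Video:") are guesses and have not been checked against the real ABC feeds. The `Article` and `Feed` classes are stand-ins so the snippet runs without calibre, and `looks_like_video` and `drop_video_articles` are hypothetical helper names.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical markers; inspect real feed entries before relying on them.
VIDEO_URL_MARKERS = ('/video/',)
VIDEO_TITLE_PREFIXES = ('video:',)


@dataclass
class Article:
    """Stand-in for calibre's article object (only title and url are used)."""
    title: str
    url: str


@dataclass
class Feed:
    """Stand-in for calibre's feed object (only the articles list is used)."""
    title: str
    articles: List[Article] = field(default_factory=list)


def looks_like_video(article):
    """Return True if the title or URL matches one of the assumed markers."""
    title = (article.title or '').strip().lower()
    url = (article.url or '').lower()
    return (title.startswith(VIDEO_TITLE_PREFIXES)
            or any(marker in url for marker in VIDEO_URL_MARKERS))


def drop_video_articles(feeds):
    """Remove suspected video-only articles from every feed, in place."""
    for feed in feeds:
        feed.articles[:] = [a for a in feed.articles if not looks_like_video(a)]
    return feeds


# Inside the recipe class, the hook would look roughly like this:
#
#     def parse_feeds(self):
#         feeds = BasicNewsRecipe.parse_feeds(self)
#         return drop_video_articles(feeds)
```

Filtering at the feed level means the skipped articles are never fetched, so they leave no empty entries in the table of contents. If the feeds give no usable hint in the title or URL, the check would have to move to the downloaded page instead.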
Old 10-29-2011, 03:50 AM   #2
axiiom
This recipe produces empty pages containing only the section heading. Anyone else having success with this recipe? I'd really like to read ABC news on my Kindle.
Old 11-01-2011, 09:24 AM   #3
Starson17
Quote:
Originally Posted by RedDogInCan View Post
One problem is that they have a habit of including video only articles in the feed and I haven't thought of a way to filter them out.
When I had the same problem, I wrote this code.
Old 11-19-2011, 11:27 PM   #4
jesseb05
Quote:
Originally Posted by axiiom View Post
This recipe produces empty pages containing only the section heading. Anyone else having success with this recipe? I'd really like to read ABC news on my Kindle.
I'm having the same issue.
Old 11-20-2011, 08:32 AM   #5
PatStapleton
I'm looking into this now. It seems the ABC have since moved their feeds. I've updated the recipe to use the new feeds, but it still needs some fixing. They must've changed some other things as well, which have broken the recipe. I'll let you know once I get it working.


-Pat
Old 11-20-2011, 10:16 AM   #6
PatStapleton
Ok I think I have it working now. More work than I thought!

Let me know if you have any problems.

Spoiler:
Code:
__license__   = 'GPL v3'
__copyright__ = '2011, Pat Stapleton <pat.stapleton at gmail.com>'
'''
abc.net.au/news
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class ABCNews(BasicNewsRecipe):
    title                  = 'ABC News'
    __author__             = 'Pat Stapleton, Dean Cording'
    description            = 'News from Australia'
    masthead_url           = 'http://www.abc.net.au/news/assets/v5/images/common/logo-news.png'
    cover_url              = 'http://www.abc.net.au/news/assets/v5/images/common/logo-news.png'

    oldest_article         = 2
    max_articles_per_feed  = 100
    no_stylesheets         = False
    #delay                  = 1
    use_embedded_content   = False
    encoding               = 'utf8'
    publisher              = 'ABC News'
    category               = 'News, Australia, World'
    language               = 'en_AU'
    publication_type       = 'newsportal'
    # preprocess_regexps   = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
    # Remove annoying map links. The inline-caption class is also used for some
    # image captions, hence the regex matches maps.google specifically.
    preprocess_regexps     = [(re.compile(r'<a class="inline-caption" href="http://maps\.google\.com.*?/a>', re.DOTALL), lambda m: '')]
    conversion_options = {
                             'comments'        : description
                            ,'tags'            : category
                            ,'language'        : language
                            ,'publisher'       : publisher
                            ,'linearize_tables': False
                         }

    keep_only_tags = [dict(attrs={'class':['article section']})]

    remove_tags = [dict(attrs={'class':['related', 'tags', 'tools', 'attached-content ready',
        'inline-content story left', 'inline-content map left contracted', 'published',
        'story-map', 'statepromo', 'topics', ]})]

    remove_attributes = ['width','height']

    feeds          = [
                      ('Top Stories', 'http://www.abc.net.au/news/feed/45910/rss.xml'),
                      ('Canberra', 'http://www.abc.net.au/news/feed/6910/rss.xml'),
                      ('Sydney', 'http://www.abc.net.au/news/feed/10232/rss.xml'),
                      ('Melbourne', 'http://www.abc.net.au/news/feed/21708/rss.xml'),
                      ('Brisbane', 'http://www.abc.net.au/news/feed/12858/rss.xml'),
                      ('Perth', 'http://www.abc.net.au/news/feed/24886/rss.xml'),
                      ('Australia', 'http://www.abc.net.au/news/feed/46182/rss.xml'),
                      ('World', 'http://www.abc.net.au/news/feed/52278/rss.xml'),
                      ('Business', 'http://www.abc.net.au/news/feed/51892/rss.xml'),
                      ('Science and Technology', 'http://www.abc.net.au/news/feed/2298/rss.xml'),
                    ]
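
The empty pages reported earlier in the thread are consistent with `keep_only_tags` no longer matching anything after the site changed: the old recipe looked for `id='article'`, while this one looks for the class `article section`. A quick way to check a selector against a saved copy of an article page, without running a full fetch, is sketched below. It uses only the Python standard library. The name `has_match` and the sample HTML are made up for illustration, and matching on the exact attribute string is an assumption about how the recipe's selector is compared.

```python
from html.parser import HTMLParser


class _AttrFinder(HTMLParser):
    """Counts start tags whose attribute `name` equals `value` exactly."""

    def __init__(self, name, value):
        super().__init__()
        self.name = name
        self.value = value
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get(self.name) == self.value:
            self.count += 1


def has_match(html, name, value):
    """Return True if any tag in `html` has attribute name == value."""
    finder = _AttrFinder(name, value)
    finder.feed(html)
    finder.close()
    return finder.count > 0


# Made-up page fragment shaped like the new layout described in this thread.
sample = '<html><body><div class="article section"><p>Text</p></div></body></html>'

print(has_match(sample, 'id', 'article'))             # False: old selector
print(has_match(sample, 'class', 'article section'))  # True: new selector
```

If the selector in use prints False for a real article page, the recipe will most likely produce pages with nothing but the heading, and the selector needs updating.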


Enjoy!

-Pat