Old 10-29-2011, 12:37 PM   #2
scissors
Addict
scissors ought to be getting tired of karma fortunes by now.
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
Hi,

I knocked this up for you. It has the first two feeds so you can see how to add the others.

It does a little tidying (there's still plenty to do), but it should give you the idea.

I'm only a basic coder, so that's about all I can help with...

Code:
from calibre.web.feeds.news import BasicNewsRecipe
import re
class AdvancedUserRecipe1306061239(BasicNewsRecipe):
    title          = u'Isle of Man News'
    #description = 'News as provide by The Daily Mirror -UK'

    __author__ = 'Dave Asbury'
    language = 'en_GB'

    #cover_url = 'http://yookeo.com/screens/m/i/mirror.co.uk.jpg'

    masthead_url = 'http://www.iomtoday.co.im/webimage/wwio_publication_logo_7_3824!image/3834190894.png_gen/derivatives/default/3834190894.png'


    oldest_article = 28
    max_articles_per_feed = 30
    remove_empty_feeds = True
    remove_javascript     = True
    no_stylesheets = True
    #use_embedded_content = True
    # Keep only the headline and the main article blocks
    keep_only_tags = [
        dict(name='h1'),
        dict(attrs={'class': ['article-attr', 'image-caption', 'editorialSectionLeft']}),
    ]
    # Drop search boxes, sub-headings, sponsor and social-bookmark panels
    remove_tags = [
        dict(name='div', attrs={'id': 'localDirSearch'}),
        dict(name='h2'),
        dict(attrs={'class': ['sponsorPanel', 'localDirSearch', 'socialBookmarkPanel', 'addthis_button']}),
    ]

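    # Strip <h1> elements that just wrap a link before the page is parsed.
    # The pattern starts at the opening '<' so no stray bracket is left behind.
    preprocess_regexps = [
        (re.compile(r'<h1><a href.*?</a></h1>', re.IGNORECASE | re.DOTALL), lambda match: ''),
    ]
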
    # (section title, feed URL) pairs
    feeds = [
        (u'News', u'http://www.iomtoday.co.im/cmlink/1.1643894'),
        (u'District News', u'http://www.iomtoday.co.im/cmlink/1.3096975'),
    ]
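
If it helps to see what the preprocess_regexps line is doing, here is a rough standalone sketch you can run outside calibre. It uses a pattern modelled on the one in the recipe, written with the opening '<' included so the whole tag is removed. The sample HTML and the function name strip_linked_headings are made up for illustration and are not real markup from iomtoday.co.im.

```python
import re

# Pattern modelled on the recipe's preprocess_regexps entry, written with
# the opening '<' included so that the whole <h1> element is removed.
LINKED_H1 = re.compile(r'<h1><a href.*?</a></h1>', re.IGNORECASE | re.DOTALL)


def strip_linked_headings(html):
    """Remove <h1> elements that wrap a link (hypothetical helper name)."""
    return LINKED_H1.sub('', html)


# Made-up fragment for illustration only.
sample = (
    '<h1><a href="/news">Section link</a></h1>\n'
    '<h1>Actual headline</h1>\n'
    '<div class="article-attr">Published today</div>'
)

cleaned = strip_linked_headings(sample)
print(cleaned)
```

The linked heading goes and the plain headline stays, which is roughly what calibre does with each downloaded page before keep_only_tags and remove_tags are applied. To try the full recipe, save it as a .recipe file and run something like ebook-convert myrecipe.recipe .epub --test -vv, which fetches only a couple of articles per feed.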
Quote:
Originally Posted by Ptephic
Hi All,

Does anyone already have a good recipe for the Isle of Man news?

http://www.iomtoday.co.im/news/isle-of-man-news

As a newbie, I really don't understand what I am doing when I try to create a custom recipe. I don't have any coding knowledge at all (if it is even classified as coding).

If anyone already has a good recipe then I would be forever grateful.

If not, then I guess I have a few long nights ahead trying to figure it all out.

Thank you very much

Jay.

Last edited by scissors; 10-29-2011 at 01:15 PM.