Thread: NY Daily News
Old 03-20-2011, 05:35 PM   #1
muggsly
Junior Member
Join Date: Mar 2011
Device: kindle
NY Daily News

Hello fellow e-reader readers =]

I am new to the recipe creation/Python biz, so I searched through examples and tried to reuse them. I took two RSS feeds from the NY Daily News website as sample feeds.

The recipe below "works", but I don't get the full articles. Each article contains only a line telling me that calibre downloaded it. =/

I looked through the older posts and I see a NY Daily News post from 2009/2010, but I don't see the finalized version.

Any help for a newbie would be appreciated!

Thanks in advance

This is the item that gets displayed:
This article was downloaded by calibre from http://www.nydailynews.com/news/ny_c...ou.html?r=news


| Section Menu | Main Menu |
| Next | Section Menu | Main Menu | Previous |
This is my Python:
Code:
#!/usr/bin/env python

__license__   = 'GPL v3'
__copyright__ = '2008, Darko Miletic <darko.miletic at gmail.com>'
'''
www.nydailynews.com
'''

from calibre.web.feeds.news import BasicNewsRecipe

class NYDailyNews(BasicNewsRecipe):
    title                 = 'NY Daily News'
    __author__            = 'Steven B'
    description           = 'NY Daily News'
    language              = 'en'
    oldest_article         = 1
    max_articles_per_feed  = 100
    no_stylesheets         = True
    use_embedded_content   = False
    encoding               = 'utf8'
    publisher              = 'NY Daily News'
    category               = 'news'
    publication_type       = 'newsportal'
    extra_css              = ' body{ font-family: Verdana,Helvetica,Arial,sans-serif } .introduction{font-weight: bold} .story-feature{display: block; padding: 0; border: 1px solid; width: 40%; font-size: small} .story-feature h2{text-align: center; text-transform: uppercase} '
    conversion_options = {
                             'comments'        : description
                            ,'tags'            : category
                            ,'language'        : language
                            ,'publisher'       : publisher
                            ,'linearize_tables': True
                         }

    keep_only_tags    = [ dict(name='div', attrs={'id':'searchresult'}) ]
    remove_tags_after = [ dict(name='div', attrs={'id':'mainbody'    }) ]
    remove_tags       = [
                           dict(name='div'  , attrs={'id':'ads' })
                          ,dict(name='table', attrs={'width':470})
                        ]


    feeds          = [
                       (u'Top Stories', u'http://www.nydailynews.com/index_rss.xml'),
                       (u'News', u'http://www.nydailynews.com/news/index_rss.xml')
                     ]
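
One possible cause, which is a guess and not a confirmed diagnosis: keep_only_tags keeps only the elements it matches, so if the article pages contain no div with id 'searchresult', nothing of the article survives and only the calibre footer is left. Below is a minimal standard-library sketch for checking whether a page contains a given tag/id pair. The names IdCollector, selector_matches, and SAMPLE_PAGE are hypothetical helpers for illustration, and the sample markup is made up; it is not the real nydailynews.com page and says nothing about which ids that site actually uses.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect (tag, id) pairs for every element that carries an id attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'id' and value:
                self.found.append((tag, value))


def selector_matches(html, tag, element_id):
    """Return True if the markup contains <tag id="element_id">."""
    collector = IdCollector()
    collector.feed(html)
    return (tag, element_id) in collector.found


# Hypothetical article markup (an assumption), NOT the real site's HTML.
SAMPLE_PAGE = '''
<html><body>
  <div id="header">site navigation</div>
  <div id="mainbody"><p>Article text.</p></div>
</body></html>
'''

if __name__ == '__main__':
    # Report, for each candidate id, whether a matching div exists.
    for element_id in ('searchresult', 'mainbody'):
        print(element_id, selector_matches(SAMPLE_PAGE, 'div', element_id))
```

To try this on a real article, save the page source from a browser and pass its text to selector_matches in place of SAMPLE_PAGE. If the id used in keep_only_tags is not found, that would point to the selector as the problem.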