03-16-2010, 01:49 PM   #1611
Ekips
Quote:
Originally Posted by Starson17
Yes, it's that one.
Code:
dict(name='div', attrs={'id':'vxFlashPlayer'})
will remove it.
Sorted that. Also sorted the £ showing up as Ł; the fix was
Code:
encoding= 'iso-8859-1'
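
My guess at why that setting helps (I have not checked what charset was being picked up before): the pound sign is the single byte 0xA3 in ISO-8859-1, and the same byte means Ł in ISO-8859-2. A minimal sketch in plain Python, nothing calibre-specific, just decoding that one byte both ways:

```python
# The raw byte a Latin-1 page uses for the pound sign.
raw = b'\xa3'

# Decoded as ISO-8859-1 (Western European) it is the pound sign, U+00A3.
as_latin1 = raw.decode('iso-8859-1')

# Decoded as ISO-8859-2 (Central European) the same byte is L-with-stroke, U+0141.
as_latin2 = raw.decode('iso-8859-2')

print(as_latin1 == u'\u00a3')
print(as_latin2 == u'\u0141')
```

So if the page really is Latin-1, naming the encoding in the recipe stops the byte being read with the wrong table.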
Tweaked a few more bits and got the main picture to show up. OK, it shows up at the end, but it's there.

Does the order of the entries in keep_only_tags affect the order they show up in?

Spoiler:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1268409464(BasicNewsRecipe):
    title = u'The Sun'
    __author__ = 'Chaz Ralph'
    description = 'News from The Sun'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    extra_css = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt }'
    charset = 'iso-8859-1'
    # Decode pages as Latin-1 so the pound sign comes out right
    encoding = 'iso-8859-1'
    remove_javascript = True

    # Only these containers are kept from each article page
    keep_only_tags = [
        dict(name='div', attrs={'class':'medium-centered'})
        ,dict(name='div', attrs={'class':'article'})
        ,dict(name='div', attrs={'class':'clear-left'})
        ,dict(name='div', attrs={'class':'text-center'})
    ]

    # Slideshows, comment links, the video player and login clutter
    remove_tags = [
        dict(name='div', attrs={'class':'slideshow'})
        ,dict(name='div', attrs={'class':'float-left'})
        ,dict(name='div', attrs={'class':'ltbx-slideshow ltbx-btn-ss'})
        ,dict(name='a', attrs={'class':'add_a_comment'})
        ,dict(name='div', attrs={'id':'vxFlashPlayerContent'})
        ,dict(name='div', attrs={'id':'k1006094r1c1t5w380h529'})
        ,dict(name='div', attrs={'id':'tum_login_form_container'})
        ,dict(name='div', attrs={'class':'discHeader'})
        ,dict(name='div', attrs={'class':'margin-bottom-neg-2'})
    ]

    feeds = [
        (u'News', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article312900.ece')
        ,(u'Sport', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article247732.ece')
        ,(u'Football', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article247739.ece')
        ,(u'Gizmo', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article247829.ece')
        ,(u'Bizarre', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article247767.ece')
    ]

    def print_version(self, url):
        # str.replace returns a new string, so the result is assigned back to url
        url = url.replace('?OTC-RSS&ATTR=News', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Royals', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Gizmo', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Boxing', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Cricket', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Football', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Rugby+Union', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Tv', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Bizarre', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Usa', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=Film', '?print=yes')
        url = url.replace('?OTC-RSS&ATTR=HomePage', '?print=yes')
        return url


That's the updated recipe.
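
A note on print_version while I'm learning Python: strings are immutable, so str.replace hands back a new string and only has an effect if the result is assigned to something. Here is a minimal standalone sketch of that. The article URL is made up for illustration, and to_print_url and SECTIONS are hypothetical names that are not part of calibre:

```python
# A few of the ATTR values the recipe looks for (subset, for illustration only).
SECTIONS = ('News', 'Football', 'Rugby+Union')

def to_print_url(url):
    """Swap the RSS tracking suffix for the print-view query string."""
    for section in SECTIONS:
        # Assign the result back: replace() does not modify url in place.
        url = url.replace('?OTC-RSS&ATTR=' + section, '?print=yes')
    return url

# Hypothetical feed link; the path is invented, only the suffix matters here.
feed_url = 'http://www.thesun.co.uk/sol/homepage/news/0000000/example.html?OTC-RSS&ATTR=News'

# Calling replace() and discarding the result leaves the string untouched.
discarded = feed_url
discarded.replace('?OTC-RSS&ATTR=News', '?print=yes')
print(discarded == feed_url)

print(to_print_url(feed_url))
```

Whether the site actually serves a print page for ?print=yes is a separate question; this only shows the string handling.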

I've been playing with Firebug, and I've also installed Python 2.6 and have been learning a little of that.