Quote:
Originally Posted by Starson17
Yes, it's that one.
Code:
dict(name='div', attrs={'id':'vxFlashPlayer'})
will remove it.
Hi, back again.
Been tweaking and playing and trying to figure out why I'm still pulling up all the slide show tags in the print version.
I checked the Job Details and noticed the url.replace was not working.
This is my cleaned-up (thanks to Starson17) url.replace code:
Code:
def print_version(self, url):
    url.replace('?OTC-RSS&ATTR=News' , '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Royals', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Our+Boys', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Gizmo', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Boxing', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Cricket', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Football', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Rugby+Union', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Tv', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Bizarre', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Usa', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Film', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=HomePage', '?print=yes')
    return url
And this is part of the Job Details
Code:
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895923/Soldiers-killed-in-Afghan-blast.html?OTC-RSS&ATTR=Our+Boys
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895648/Grieving-dads-drug-warning.html?OTC-RSS&ATTR=News
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895808/Soup-poison-bid-at-posh-school.html?OTC-RSS&ATTR=News
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895647/Royal-Navy-sends-Swiftsure-class-attack-submarine-to-Falkland-Islands-to-boost-security.html?OTC-RSS&ATTR=Our+Boys
So I #'d out the original url.replace and wrote in a single one, like so.
Code:
def print_version(self, url):
    return url.replace('OTC-RSS&ATTR=News', 'print=yes')
# def print_version(self, url):
# url.replace('?OTC-RSS&ATTR=News' , '?print=yes')
# url.replace('?OTC-RSS&ATTR=Royals', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Gizmo', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Boxing', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Cricket', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Football', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Rugby+Union', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Tv', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Bizarre', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Usa', '?print=yes')
# url.replace('?OTC-RSS&ATTR=Film', '?print=yes')
# url.replace('?OTC-RSS&ATTR=HomePage', '?print=yes')
# return url
And on checking the Job Details, it worked.
Code:
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895923/Soldiers-killed-in-Afghan-blast.html?OTC-RSS&ATTR=Our+Boys
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895648/Grieving-dads-drug-warning.html?print=yes
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895808/Soup-poison-bid-at-posh-school.html?print=yes
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895647/Royal-Navy-sends-Swiftsure-class-attack-submarine-to-Falkland-Islands-to-boost-security.html?OTC-RSS&ATTR=Our+Boys
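One likely explanation for the difference, as a minimal sketch in plain Python (run outside calibre): strings are immutable, so str.replace returns a new string and leaves url untouched. The single-line version returns that new string, while the longer version throws each result away and then returns the original url. The function names below are made up for illustration and are not part of any recipe.

```python
def discards_result(url):
    # replace() builds a new string, but nothing keeps it.
    url.replace('?OTC-RSS&ATTR=News', '?print=yes')
    return url  # still the original string


def keeps_result(url):
    # Rebinding url keeps each replacement, so several calls can be stacked.
    url = url.replace('?OTC-RSS&ATTR=News', '?print=yes')
    url = url.replace('?OTC-RSS&ATTR=Our+Boys', '?print=yes')
    return url


news = ('http://www.thesun.co.uk/sol/homepage/news/2895648/'
        'Grieving-dads-drug-warning.html?OTC-RSS&ATTR=News')
print(discards_result(news))
print(keeps_result(news))
```

This also fits the log above: only the News URLs changed, because the single-line version only looks for the News string.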
I had left out the '?' in that single-line version, so I thought I might as well check whether that made the difference.
Code:
def print_version(self, url):
    url.replace('OTC-RSS&ATTR=News' , 'print=yes'),
    url.replace('?OTC-RSS&ATTR=Royals', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Our+Boys', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Gizmo', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Boxing', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Cricket', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Football', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Rugby+Union', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Tv', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Bizarre', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Usa', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=Film', '?print=yes'),
    url.replace('?OTC-RSS&ATTR=HomePage', '?print=yes')
    return url
But that didn't work either.
Code:
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895923/Soldiers-killed-in-Afghan-blast.html?OTC-RSS&ATTR=Our+Boys
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895648/Grieving-dads-drug-warning.html?OTC-RSS&ATTR=News
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/2895808/Soup-poison-bid-at-posh-school.html?OTC-RSS&ATTR=News
Downloading
Fetching http://www.thesun.co.uk/sol/homepage/news/campaigns/our_boys/2895647/Royal-Navy-sends-Swiftsure-class-attack-submarine-to-Falkland-Islands-to-boost-security.html?OTC-RSS&ATTR=Our+Boys
So can anyone point out what I've done wrong?
Is there a way to replace everything after the '?' with 'print=yes', no matter what it says?
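A minimal sketch of one way to do that, assuming each feed URL has at most one '?' and that the print page needs nothing more than ?print=yes. It uses the built-in str.partition, so it does not care which ATTR value follows. The helper name to_print_url is hypothetical; in a recipe the same two lines would presumably sit inside print_version(self, url).

```python
def to_print_url(url):
    # partition() splits at the first '?'; base is everything before it.
    # If the url has no '?', base is simply the whole url.
    base, _sep, _query = url.partition('?')
    return base + '?print=yes'


print(to_print_url(
    'http://www.thesun.co.uk/sol/homepage/news/2895648/'
    'Grieving-dads-drug-warning.html?OTC-RSS&ATTR=News'))
```

Because the query string is dropped wholesale, new feed sections would not need their own replace line.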