I am completing some modifications to my New York Times scoop. Because the NYT archives its content after 10 days, the printer-friendly links that I use in my scoop stop working after that time. In most cases the scooped pages are still fine, but sometimes a story is split into parts and the following pages are not downloaded (even with StoryFollowLinks turned on; I'm not sure why). So I am trying to add a routine in URLProcess that reads the story date from the URL and drops the story from the scoop if it is older than 10 days. My code works except for the call to timelocal (from timelocal.pl), which is not recognized. I get this error:
SITE WARNING: "NYTimes.site" line 7: URLProcess failed: Undefined subroutine
&Sitescooper::URLProcessor::timelocal called at (eval 23) line 1.
I have no Perl experience and am just hacking this thing together as I go.
I cannot figure out how to resolve this. I tried adding the "require
'timelocal.pl'" command to "Main.pm" and to "StoryURLProcessor.pm", but that
didn't help either. Can someone help me? Below is the URLProcess code that
I'm using.
URLProcess: {
#Check for stories that are older than 10 days
#The printable page version won't work for these
#Currently they are just ignored
#Who needs news older than 10 days anyway?
require 'timelocal.pl';
my $url = $_;
my $y = index($url,"200");
my $year = substr($url,$y,4);
my $month = substr($url,$y+5,2);
my $day = substr($url,$y+8,2);
#my $giventime = timelocal(0,0,0,$day,$month-1,$year);
#my $currenttime = timelocal(0,0,0,(localtime) [3,4,5]);
if ((timelocal(0,0,0,(localtime) [3,4,5]) -
timelocal(0,0,0,$day,$month-1,$year)) > 10*24*3600) {
$_ = undef;
}
}
This code works well in testing outside of Sitescooper, so I'm fairly sure there is a way to write the call that Sitescooper will accept. Can anyone clue me in? I have also posted to the Sitescooper mailing list, but traffic there has been very light of late... Thanks.
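
One possible explanation, not verified against Sitescooper itself: timelocal.pl is a Perl 4 compatibility wrapper that defines timelocal() in whichever package was current the first time it was required. If it was first loaded elsewhere, a later require from inside Sitescooper::URLProcessor does nothing and the name stays undefined there. A minimal sketch under that assumption loads the core Time::Local module and calls the function by its full name. The helper name is_older_than and the URLs used to exercise it are hypothetical, and the date pattern assumes the URL contains YYYY, MM and DD separated by single characters, as the substr offsets above imply.

```perl
use strict;
use warnings;
use Time::Local ();   # core module; nothing is imported, calls are fully qualified

# Hypothetical helper: true if the YYYY?MM?DD date embedded in $url is more
# than $max_days days before $now (epoch seconds).
sub is_older_than {
    my ($url, $max_days, $now) = @_;

    # Match an explicit date pattern instead of searching for "200",
    # which could also match elsewhere in the URL.
    my ($year, $month, $day) = $url =~ /(\d{4})\D(\d{2})\D(\d{2})/
        or return 0;   # no date found: keep the story

    # The full package name resolves no matter which package this code runs in.
    my $story = Time::Local::timelocal(0, 0, 0, $day, $month - 1, $year);
    return ($now - $story) > $max_days * 24 * 3600;
}

# Inside URLProcess the check might then reduce to something like:
#   $_ = undef if is_older_than($_, 10, time);
```

If defining a helper sub inside the site file is not practical, the same idea can be applied inline: replace require 'timelocal.pl' with require Time::Local and write each call as Time::Local::timelocal(...).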