08-02-2004, 08:14 AM   #1
geoffreynz
Lots of new German, French and English scoops

(Also posted to the sitescooper-talk mailing list, http://sitescooper.org/pipermail/sitescooper-talk/).



Here are a heap of scoops I have made for Sitescooper since November 2003. As of August 2, 2004, they are working correctly. If anyone can improve on them, feel free to do so and post the results; there are certainly some rough edges here and there. Thanks to everyone who has helped me with problems along the way, notably Kennis Koldewyn and Goh Boon Nam. Sorry if there's anyone else in particular whom I've forgotten. Disclaimer: no responsibility or liability accepted for what happens when you use these .site files!

Enjoy! My personal favourites are marked with an asterisk (*). If you want to discuss anything, my email address is geoffreynz at that funny hilarious yahooing address at a dot com domain.

INDEX:

- New Zealand Scoops

NZ Herald x7
Newstalk ZB
ConcertFM
National Radio
NZ Listener
Sunday Star Times

- French Scoops
AFP/AP x2*
Le Monde*
L'express
Liberation x2

- German Scoops
AP Deutsch***
Berliner Zeitung x3
Currymafia.de*
German Words of the Day
n-tv Politik
n-tv Programm x7
Reuters Deutsch
Tagesspiegel x3

- English Scoops
BusinessWeek Daily Briefing
Newsweek International
NYT x6
USA Today x4


New Zealand scoops

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: NZ Herald
Description: New Zealand Herald Friday
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Thursday,
StoryURL: http://www.nzherald.co.nz/.*
StoryEnd: <i>- ADVERTISEMENT -</i>
StoryStart: <td valign="top" width="325">
# ImageURL: .*\.jpg
# ImageURL: .*\.JPG
SizeLimit: 2000

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: NZ Herald
Description: New Zealand Herald Monday
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Sunday,
StoryURL: http://www.nzherald.co.nz/.*
StoryEnd: <i>- ADVERTISEMENT -</i>
StoryStart: <td valign="top" width="325">
# ImageURL: .*\.jpg
# ImageURL: .*\.JPG
SizeLimit: 2000

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: Weekend Herald
Description: Weekend Herald
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Friday,
StoryURL: http://www.nzherald.co.nz/.*
StoryStart: <td valign="top" width="325">
StoryEnd: <i>- ADVERTISEMENT -</i>
ImageURL: .*\.jpg
ImageURL: .*\.JPG
SizeLimit: 2000

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: NZ Herald
Description: New Zealand Herald Thursday
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Wednesday,
StoryURL: http://www.nzherald.co.nz/.*
StoryStart: <td valign="top" width="325">
# ImageURL: .*\.jpg
StoryEnd: <i>- ADVERTISEMENT -</i>
# ImageURL: .*\.JPG
SizeLimit: 2000

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: NZ Herald
Description: New Zealand Herald Tuesday
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Monday,
StoryURL: http://www.nzherald.co.nz/.*
StoryEnd: <i>- ADVERTISEMENT -</i>
StoryStart: <td valign="top" width="325">
# ImageURL: .*\.jpg
# ImageURL: .*\.JPG
SizeLimit: 2000

URL: http://www.nzherald.co.nz/storyarchi...ection=general
Name: NZ Herald
Description: New Zealand Herald Wednesday
Levels: 2
ContentsStart: <span class="headlinessmall">
ContentsEnd: Tuesday,
StoryURL: http://www.nzherald.co.nz/.*
StoryEnd: <i>- ADVERTISEMENT -</i>
StoryStart: <td valign="top" width="325">
# ImageURL: .*\.jpg
# ImageURL: .*\.JPG
SizeLimit: 2000
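
For anyone new to .site files, the Start/End pairs in the scoops above can be pictured as "keep only what sits between these two markers". The Python below is only a rough sketch of that idea, not Sitescooper's own code: the function name `extract_region` and the sample page are made up, and it assumes plain literal markers that are dropped from the result, whereas Sitescooper accepts patterns and may behave differently in detail.

```python
def extract_region(page: str, start: str, end: str) -> str:
    """Return the text between the first `start` marker and the next `end` marker.

    Assumption: markers are literal strings and are excluded from the result.
    """
    begin = page.find(start)
    if begin == -1:
        return ""  # start marker missing: nothing to keep
    begin += len(start)
    stop = page.find(end, begin)
    # If the end marker is missing, keep everything after the start marker.
    return page[begin:] if stop == -1 else page[begin:stop]


# Hypothetical page snippet, loosely shaped like an archive listing.
page = (
    '<html><span class="headlinessmall">'
    '<a href="/story1">Story one</a> Thursday, <a href="/old">Old story</a></html>'
)
region = extract_region(page, '<span class="headlinessmall">', "Thursday,")
print(region)
```

In the sample, only the link that appears before "Thursday," survives, which is the effect the per-weekday ContentsEnd lines seem to be aiming for.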

# Change the two Xs in the URL to today's date, without leading 0s, e.g. m=7 d=29
URL: http://www.telstraclear.co.nz/newsfe...y=2004&m=X&d=X
Name: NewstalkZB
Levels: 2
ContentsStart: <TD VALIGN=TOP>
ContentsEnd: <TD WIDTH=155 VALIGN=TOP align=center>
StoryURL: http://www.telstraclear.co.nz/newsfeed/.*
StoryStart: <TD VALIGN=TOP>
StoryEnd: <td width=100% align="left">
StoryHTMLPreProcess: {
s,Â,,gis;
}
ContentsHTMLPreProcess: {
s,Â,,gis;
}
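
The manual step in the comment above (swap the two Xs for today's month and day, no leading zeros) could be scripted. A minimal Python sketch, assuming the URL really does contain `m=X` and `d=X` as shown; the helper name `fill_date` and the example.invalid template are hypothetical, since the real URL is truncated in this post.

```python
import re
from datetime import date


def fill_date(url_template: str, today: date) -> str:
    """Replace the m=X and d=X placeholders with month and day, no leading zeros."""
    url = re.sub(r"\bm=X\b", f"m={today.month}", url_template)
    url = re.sub(r"\bd=X\b", f"d={today.day}", url)
    return url


# Hypothetical template standing in for the truncated Newstalk ZB URL.
template = "http://example.invalid/newsfeed?y=2004&m=X&d=X"
print(fill_date(template, date(2004, 7, 29)))
```

The year is left alone here because the comment only mentions the month and day; it would need the same treatment once 2004 is over.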

URL: http://www.radionz.co.nz/index.php?n...ion=c_schedule
Name: ConcertFM
Description: NZ Concert FM schedule
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- start content -->
ContentsEnd: <!-- end content -->
StoryStart: <!-- start content -->
StoryEnd: <!-- end content -->
StoryURL: http://www.radionz.co.nz/.*

URL: http://www.radionz.co.nz/index.php?section=schedule
Name: NationalRadio
Description: National Radio schedule
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- start content -->
ContentsEnd: <!-- end content -->
StoryStart: <!-- start content -->
StoryEnd: <!-- end content -->
StoryURL: http://www.radionz.co.nz/.*
StorySkipURL: http://www.radionz.co.nz/index.php?i...ction=schedule

URL: http://www.listener.co.nz
Name: NZ Listener
AuthorName: Geoffrey Miller geoffreynz /atsymbol/ yahoo dot com
Description: 'The New Zealand Listener is the country's only national, weekly current affairs and entertainment magazine.'
Levels: 2
ContentsStart: <!-- CENTRE COLUMN -->
ContentsEnd: <!-- END CENTRE COLUMN -->
StoryURL: http://www.listener.co.nz/default,\d+\.sm
StoryURL: http://www.listener.co.nz/default,\d+,\d+,\d\.sm
StoryStart: <!-- CENTRE COLUMN -->
StoryEnd: <!-- END CENTRE COLUMN -->
StoryFollowLinks: 1


URL: http://www.stuff.co.nz/stuff/sundays...0a6445,00.html
Name: SST Business
Description: New Zealand Sunday Star Times - Business
Levels: 2
ContentsStart: <br clear="left">
ContentsEnd: <br clear="right">
StoryURL: http://www.stuff.co.nz/stuff/sundaystartimes/.*\.html
StoryStart: <br clear="left">
StoryEnd: <br><br>

URL: http://www.stuff.co.nz/stuff/sundays...a11155,00.html
Name: SST Escape
Description: New Zealand Sunday Star Times - Escape
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <br clear="left">
ContentsEnd: <br clear="right">
StoryURL: http://www.stuff.co.nz/stuff/sundaystartimes/.*\.html
StoryStart: <br clear="left">
StoryEnd: <br><br>
SizeLimit: 200

URL: http://www.stuff.co.nz/stuff/sundays...0a6619,00.html
Name: SST Focus
Description: New Zealand Sunday Star Times - Focus
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <br clear="left">
ContentsEnd: <br clear="right">
StoryURL: http://www.stuff.co.nz/stuff/sundaystartimes/.*\.html
StoryStart: <br clear="left">
StoryEnd: <br><br>
SizeLimit: 200

URL: http://www.stuff.co.nz/stuff/sundays...0a6444,00.html
Name: SST Sport
Description: New Zealand Sunday Star Times - Sport
Levels: 2
ContentsStart: <br clear="left">
ContentsEnd: <br clear="right">
StoryURL: http://www.stuff.co.nz/stuff/sundaystartimes/.*\.html
StoryStart: <br clear="left">
StoryEnd: <br><br>

URL: http://www.stuff.co.nz/stuff/0,2106,0a6442,00.html
Name: SST News
Description: New Zealand Sunday Star Times - News
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <br clear="left">
ContentsEnd: <br clear="right">
StoryURL: http://www.stuff.co.nz/stuff/sundaystartimes/.*\.html
StoryStart: <br clear="left">
StoryEnd: <br><br>

French scoops

URL: http://fr.news.yahoo.com/121/
Name: AFP/AP France
Description: AFP/AP France
Levels: 2
ContentsStart: <table border=0 width=100% cellpadding=0 cellspacing=0><tr><td valign=top>
ContentsEnd: <table border=0 cellpadding=2 cellspacing=0><tr><td>
StoryURL: http://fr.news.yahoo.com/.*\.html
StoryStart: </TABLE></TD></TR></TABLE></td></tr></table>
StoryEnd: <table border=0 cellpadding=2 cellspacing=0><tr>
ImageURL: http://eur.news1.yimg.com/eur.yimg.com/.*\.jpg
StoryHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
s,<A HREF=".*">agrandir la photo</A>,,gis;
}
ContentsHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
# s,<A HREF=".*"><IMG alt=Photo SRC=.*.jpg.*</A>,,gis;
}
# ContentsUseTableSmarts: 0
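
The `..x0153;` and `..x2026;` patterns above look like they are aimed at numeric character references such as `&#x0153;`, with the first two characters wildcarded. As a general illustration (not part of the .site file), here is a small Python sketch that decodes any numeric reference; `decode_numeric_refs` is a made-up name. Note that it produces the real ellipsis character, while the scoop deliberately writes three plain dots.

```python
import re


def decode_numeric_refs(text: str) -> str:
    """Turn &#NNN; and &#xHHH; references into the characters they name."""

    def repl(match: re.Match) -> str:
        ref = match.group(1)
        # A leading x/X marks a hexadecimal reference; otherwise it is decimal.
        code = int(ref[1:], 16) if ref[0] in "xX" else int(ref)
        return chr(code)

    return re.sub(r"&#([xX][0-9a-fA-F]+|[0-9]+);", repl, text)


print(decode_numeric_refs("c&#x0153;ur&#x2026; &#8222;Zeit"))
```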

URL: http://fr.news.yahoo.com/2/
Name: AFP/AP Monde
Description: AFP/AP Monde
Levels: 2
ContentsStart: <table border=0 width=100% cellpadding=0 cellspacing=0><tr><td valign=top>
ContentsEnd: <table border=0 cellpadding=2 cellspacing=0><tr><td>
StoryURL: http://fr.news.yahoo.com/.*\.html
StoryStart: </TABLE></TD></TR></TABLE></td></tr></table>
StoryEnd: <table border=0 cellpadding=2 cellspacing=0><tr>
ImageURL: http://eur.news1.yimg.com/eur.yimg.com/.*\.jpg
StoryHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
s,<A HREF=".*">agrandir la photo</A>,,gis;
}
ContentsHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
}

URL: http://www.lemonde.fr/txt/sequence/0,2-3208,1-0,0.html
Name: Le Monde
Description: Le Monde
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- gab-pv-sujet_home_tete_manchette.php -->
ContentsEnd: <td width="8" valign="top" align=center>
StoryURL: http://www.lemonde.fr/txt/article/.*\.html
StoryStart: <!-- /gab-pv-pub_OAS_468.php -->
StoryEnd: <!-- gab-pv-liens_publicitaires.php -->
SizeLimit: 5000
# StoryHTMLPreProcess: {
# s,<a href=http://www.lemonde.fr/web/imprimer.*Classer</a><.,,gis;
# s,.*Recherche du tag de pub.*>,,gis;
# }

URL: http://www.lexpress.fr/info/france/
Name: L'express France
Description: L'express France
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: France</td>
ContentsEnd: LEXPRESS interactif
StoryURL: http://www.lexpress.fr/.*\.asp.*
StoryStart: <span class="PubDate">
StoryEnd: <a href="#Top" class="mini-liens">
ContentsUseTableSmarts: 0
StoryHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
}
ContentsHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
}

URL: http://www.lexpress.fr/info/monde/
Name: L'express Monde
Description: L'express Monde
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: Monde</td>
ContentsEnd: LEXPRESS interactif
StoryURL: http://www.lexpress.fr/.*\.asp.*
StoryStart: <span class="PubDate">
StoryEnd: <a href="#Top" class="mini-liens">
ContentsUseTableSmarts: 0
StoryHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
}
ContentsHTMLPreProcess: {
s,..x0153;,œ,gis;
s,..x2026;,...,gis;
}

URL: http://www.liberation.fr/page.php?Rubrique=EVENEMENT
Name: Libération Événements
Description: Libération Événements
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- col centrale appels haut -->
ContentsEnd: <!-- fin col centrale appels haut -->
StoryURL: http://www.liberation.fr/page.php.*
StoryStart: <span class=actu-tit2>
StoryEnd: <span class="art-postxt">

URL: http://www.liberation.fr/page.php?Rubrique=MONDE
Name: Libération Monde
Description: Libération Monde
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- col centrale appels haut -->
ContentsEnd: <!-- fin col centrale appels haut -->
StoryURL: http://www.liberation.fr/page.php.*
StoryStart: <span class=actu-tit2>
StoryEnd: <span class="art-postxt">


German Scoops

URL: http://www.mittelhessen.de/ap/apnews_overview.php
Name: AP Deutsch
Description: AP Nachrichten auf Deutsch
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <table width="750" border="0" cellspacing="0" cellpadding="0">
ContentsEnd: <!-- IVW VERSION="1.2" -->
StoryURL: http://www.mittelhessen.de/ap/apnews.php.*
StoryStart: <table width="750" border="0" cellspacing="0" cellpadding="0">
StoryEnd: <!-- IVW VERSION="1.2" -->
ContentsUseTableSmarts: 0
SizeLimit: 5000

URL: http://www.berlinonline.de/berliner-...zin/index.html
Name: BZ Magazin
Description: Berliner Zeitung Magazin
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <a name="beginContent" class="invisible">
ContentsEnd: <!--artikel ende-->
StoryURL: http://www.berlinonline.de/berliner-...azin/\d+\.html
StoryStart: <a name="beginContent" class="invisible">
StoryEnd: <!--artikel ende-->
StoryHTMLPreProcess: {
s,<div style="align:right.*?</div>,,gis;
s,<h3>(.*?)</h3>,<b>$1</b>,gis;
}
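
The second substitution in the block above is presumably meant to turn h3 headings into bold text, which needs a capture group so the heading text can be carried into the replacement. A quick way to try the idea outside Sitescooper is Python's `re` module; this is just a sketch, and `h3_to_bold` is a made-up helper name.

```python
import re


def h3_to_bold(html: str) -> str:
    """Rewrite <h3>...</h3> as <b>...</b>, keeping the heading text."""
    # IGNORECASE and DOTALL roughly mirror the i and s flags used in the scoop.
    return re.sub(
        r"<h3>(.*?)</h3>",
        r"<b>\1</b>",
        html,
        flags=re.IGNORECASE | re.DOTALL,
    )


print(h3_to_bold("<H3>Titel</H3><p>Text</p>"))
```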

URL: http://www.berlinonline.de/berliner-...tik/index.html
Name: BZ Politik
Description: Berliner Zeitung Politik
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <a name="beginContent" class="invisible">
ContentsEnd: <!--artikel ende-->
StoryURL: http://www.berlinonline.de/berliner-...itik/\d+\.html
StoryStart: <a name="beginContent" class="invisible">
StoryEnd: <!--artikel ende-->
StoryHTMLPreProcess: {
s,<div style="align:right.*?</div>,,gis;
s,<h3>(.*?)</h3>,<b>$1</b>,gis;
}

URL: http://www.berlinonline.de/berliner-...ise/index.html
Name: BZ Reise
Description: Berliner Zeitung Reise
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <a name="beginContent" class="invisible">
ContentsEnd: <!--artikel ende-->
StoryURL: http://www.berlinonline.de/berliner-...eise/\d+\.html
StoryStart: <a name="beginContent" class="invisible">
StoryEnd: <!--artikel ende-->
StoryHTMLPreProcess: {
s,<div style="align:right.*?</div>,,gis;
s,<h3>(.*?)</h3>,<b>$1</b>,gis;
}

URL: http://www.currymafia.de/
Name: Currymafia.de
Description: Currymafia.de Blog
AuthorName: Geoffrey Miller
Levels: 1
ContentsStart: <!-- INSERT BLOGGER CODE HERE -->
ContentsEnd: <!-- END BLOGGER CODE -->

URL: http://german.about.com/library/blworttag.htm
Name: German WD
Description: german.about.com's German Words of the Day
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <b>1. Woche</b></font>
ContentsEnd: <b>Related</b>
StoryStart: <font face="verdana, geneva, arial" color="#666666" size="-1">
StoryEnd: BACK &gt; <a href="../blworttag.htm">
StoryEnd: German Chat
StoryURL: http://german.about.com/library/definitions/.*\.htm

URL: http://www.ikz-online.de/ikz/ikz.rei...ueberblick.php
Levels: 2
Name: gms Reise
ContentsStart: header.berichte2.gif
ContentsEnd: <!-- Ende - Z_2sp_dpa_Uebers_Fortl_SQL -->
StoryURL: http://www.ikz-online.de/.*
StoryStart: <!-- Ende - Z_2sp_Multicom_Lang_SQL -->
StoryEnd: <span class="contentfliess">
ImageURL: http://www.ikz-online.de/includes/bi....php?.*.nitf.*

URL: http://mobil.n-tv.de/pocketpc/110.html
Name: n-tv Politik
Description: n-tv Nachrichtensender Politik
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <font color="#00309C">
ContentsEnd: <tr><td><b>&nbsp;»&nbsp;</b>
StoryStart: <font color="#00309C">
ImageURL: .*\.jpg.*
StoryEnd: <tr><td><b>&nbsp;»&nbsp;</b>
StoryURL: http://mobil.n-tv.de/pocketpc/.*\.html

URL: http://www.n-tv.de/701857.html?day=4
Name: n-tv Fri
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">

URL: http://www.n-tv.de/701857.html?day=0
Name: n-tv Mon
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">
ContentsHTMLPreProcess: {
s,<a href\="(.*\.html)" class="appLink">,,gis;
}

URL: http://www.n-tv.de/701857.html?day=5
Name: n-tv Sat
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">

URL: http://www.n-tv.de/701857.html?day=6
Name: n-tv Sun
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">

URL: http://www.n-tv.de/701857.html?day=3
Name: n-tv Thurs
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">

URL: http://www.n-tv.de/701857.html?day=1
Name: n-tv Tue
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">

URL: http://www.n-tv.de/701857.html?day=2
Name: n-tv Wed
ContentsStart: " class="boxThema">
ContentsEnd: <td width="165" valign="top">
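
The seven n-tv programme scoops above differ only in the `day=` value, which runs from 0 for Monday to 6 for Sunday. That happens to be the same numbering Python's `date.weekday()` uses, so picking today's URL can be sketched as below. The helper name `ntv_url_for` is hypothetical; the URL pattern is taken from the scoops above.

```python
from datetime import date

NTV_PROGRAMME_URL = "http://www.n-tv.de/701857.html?day={day}"


def ntv_url_for(day: date) -> str:
    """Build the programme URL for a date (Monday is 0, Sunday is 6)."""
    return NTV_PROGRAMME_URL.format(day=day.weekday())


# 2 August 2004 was a Monday, so this gives the day=0 URL.
print(ntv_url_for(date(2004, 8, 2)))
```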

URL: http://www.reuters.de/newsAdditional...e&section=news
Name: Reuters Deutsch
Description: Reuters Nachrichten auf Deutsch
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <tr><td colspan="2" class="v11BlackBold" bgcolor="#F4D592">
ContentsEnd: <td valign="top" width="6">
StoryStart: <span class="v18BlackBold">
StoryEnd: <td colspan="4" height="5">
ImageURL: http://wwwi.reuters.com/images/.*\.jpg
ContentsHTMLPreProcess: {
s,ä,ä,gis;
s,ü,ü,gis;
s,ß,ß,gis;
s,ö,ö,gis;
s,Ãœ,Ü,gis;
s,Ö,Ö,gis;
s,Ä,Ä,gis;
}
StoryHTMLPreProcess: {
s,ä,ä,gis;
s,ü,ü,gis;
s,ß,ß,gis;
s,ö,ö,gis;
s,Ãœ,Ü,gis;
s,Ö,Ö,gis;
s,Ä,Ä,gis;
s,<a.*?>,,gis;
s,<br.*?>,,gis;
s,<hr.*?>,,gis;
s,<br.*?>,,gis;
}
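
The `Ãœ` to `Ü` line above is a typical symptom of UTF-8 text being read as Windows-1252: each umlaut turns into two odd characters. For the curious, the effect and its reversal can be reproduced in Python. This is a sketch under that encoding assumption, and both helper names are made up.

```python
def to_mojibake(text: str) -> str:
    """Show what UTF-8 text looks like when wrongly read as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def fix_mojibake(text: str) -> str:
    """Undo the damage: re-encode as Windows-1252, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


for ch in "äöüßÄÖÜ":
    broken = to_mojibake(ch)
    print(ch, "->", broken, "->", fix_mojibake(broken))
```

One round trip repairs every character at once, whereas the substitution approach needs one line per character; inside a .site file, though, the per-character lines are what is available.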

URL: http://archiv.tagesspiegel.de/ressort.html?r=Politik
Name: TS-Politik
AuthorName: Geoffrey Miller
Levels: 2
# ImageURL: file://localhost/C:/Documents and Settings/Geoffrey/Desktop/.*\.jpg
ContentsStart: <!-- Hier beginnt die Inhaltstabelle -->
ContentsEnd: <!--ENDE Spalte 2, Aktuelles-->
StoryStart: <!-- Hier beginnt der Artikel -->
StoryEnd: <!-- Hier endet der Artikel -->
# StoryURL: http://archiv.tagesspiegel.de/archiv/.*/\d+\.asp
StoryURL: http://archiv.tagesspiegel.de/archiv/.*
StoryHTMLPreProcess: {
# Undoes blockquote text - delete if blockquote text is desired
s,<blockquote>,,gis;
s,</blockquote>,,gis;
}
# ContentsHTMLPreProcess: {
# s,([&Uuml;bersicht]),(<img src="file://localhost/C:/Documents and Settings/Geoffrey/Desktop/tsp3.jpg">),gis;
# }

URL: http://archiv.tagesspiegel.de/ressort.html?r=Reise
Name: TS-Reise
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- Hier beginnt die Inhaltstabelle -->
ContentsEnd: <!--ENDE Spalte 2, Aktuelles-->
StoryStart: <!-- Hier beginnt der Artikel -->
StoryEnd: <!-- Hier endet der Artikel -->
StoryURL: http://archiv.tagesspiegel.de/archiv/.*/\d+\.asp
StoryHTMLPreProcess: {
# Undoes blockquote text - delete if blockquote text is desired
s,<blockquote>,,gis;
s,</blockquote>,,gis;
}

URL: http://archiv.tagesspiegel.de/ressort.html?r=Sonntag
Name: TS-Sonntag
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- Hier beginnt die Inhaltstabelle -->
ContentsEnd: <!--ENDE Spalte 2, Aktuelles-->
StoryStart: <!-- Hier beginnt der Artikel -->
StoryEnd: <!-- Hier endet der Artikel -->
StoryURL: http://archiv.tagesspiegel.de/archiv/.*/\d+\.asp
StoryHTMLPreProcess: {
# Undoes blockquote text - delete if blockquote text is desired
s,<blockquote>,,gis;
s,</blockquote>,,gis;
}


English scoops

URL: http://www.businessweek.com/bwdaily/index.html
Name: BW DailyBriefing
Levels: 2
ContentsStart: <IMG SRC="http://images.businessweek.com/common_images/db_sec.gif
ContentsEnd: <a name="archive">
StoryStart: <!--SECTION-->
StoryStart: <!--STRAP-->
StoryStart: <!--DATE-->
StoryEnd: <!--/STORY-->
StoryURL: http://www.businessweek.com/.*\.htm


# This one mostly by Goh Boon Nam, with a little help from me, kudos to him, see below :-)
URL: http://www.msnbc.com/news/nw-int_front.asp?
Name: NewsweekIntl
Description: Newsweek International
AuthorName: Goh Boon Nam
# Version 1.4
# Date updated : 23 Apr 2004
# Updated by : Goh Boon Nam and Geoffrey Miller
# Changes made : Revamp due to change in design of web site

Levels: 2

ContentsStart: FROM THIS WEEK'S ISSUE
ContentsEnd: FROM THE PREVIOUS ISSUE

StoryURL: http://msnbc.msn.com/id/.*

StoryStart: <div class="headlineStory">
StoryEnd: ©(.*?)Newsweek, Inc

# StoryFollowLinks: 1

# urgh, first article title is an image. Use its alt tag
#UseAltTagForURL: http://(.*?).jpg

ContentsHTMLPreProcess: {
s/align="right"//gim;
s/align="center"//gim;
s/align=right//gim;
s/align=center//gim;
s/Â//gim;
s/—/--/gim;
s/•/<BR>/gim;
s/ //gim;
}

StoryHTMLPreProcess: {
s/align="right"//gim;
s/align="center"//gim;
s/align=right//gim;
s/align=center//gim;
s/Â//gim;
s/—/--/gim;
s/•/<BR>/gim;
s/ //gim;
s/advertisement<br>//gim;
s/<font class="textSmallBold"(.*?)<\/table><\/td><\/tr><\/table>//gis;
}

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/books/review/text/index.html
Name: NYT BookReview
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>
# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/international/text/index.html
Name: NYT International
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>
# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}


# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/magazine/text/index.html
Name: NYT Magazine
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>
# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}

StoryURL: http://www.nytimes.com/.*\.html.*
StoryStart: <NYT_HEADLINE
StoryEnd: </NYT_TEXT
StoryToPrintableSub: s,(.*),$1?position=&pagewanted=print&position=,


# Story pre-processing:
StoryHTMLPreProcess: {
# Remove lists of online links, inline tables, inline images, etc.:
s,<NYT_AD.*?</NYT_ADD>,,gis;
s,<NYT_BANNER.*?</NYT_BANNER>,,gis;
s,<NYT_INLINEBLURB.*?</?NYT_INLINEBLURB>,,gis;
s,<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>,,gis;
s,<NYT_INLINETABLE.*?</?NYT_INLINETABLE>,,gis;
s,<NYT_LINKS.*?</NYT_LINKS>,,gis;
s,<NYT_LINKS_ONSITE.*?</?NYT_LINKS_ONSITE>,,gis;
s,<NYT_LINKS_OFFSITE.*?</?NYT_LINKS_OFFSITE>,,gis;

# Remove other NYT-specific tags:
s,<\/?NYT_.*?>,,gim;
}
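
The StoryToPrintableSub line above appends a fixed query string to each story URL so that the printer-friendly page is fetched instead. The same rewrite in Python, as a sketch only: `to_printable` is a made-up name and the sample URL is hypothetical.

```python
import re


def to_printable(url: str) -> str:
    """Append the print-view query string, as the substitution above does."""
    # count=1 matters: without it Python also replaces the empty match at the
    # end of the string, unlike a Perl s/// without the g flag.
    return re.sub(r"(.*)", r"\1?position=&pagewanted=print&position=", url, count=1)


# Hypothetical story URL, for illustration only.
print(to_printable("http://www.nytimes.com/2004/08/01/example.html"))
```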

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/travel/sundaytravel/text/index.html
Name: NYT SunTravel
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="469" height="1" border="0"/>

# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}

StoryURL: http://www.nytimes.com/.*\.html.*
StoryStart: <NYT_HEADLINE
StoryEnd: </NYT_TEXT
StoryToPrintableSub: s,(.*),$1?position=&pagewanted=print&position=,

# Story pre-processing:
StoryHTMLPreProcess: {
# Remove lists of online links, inline tables, inline images, etc.:
s,<NYT_AD.*?</NYT_ADD>,,gis;
s,<NYT_BANNER.*?</NYT_BANNER>,,gis;
s,<NYT_INLINEBLURB.*?</?NYT_INLINEBLURB>,,gis;
s,<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>,,gis;
s,<NYT_INLINETABLE.*?</?NYT_INLINETABLE>,,gis;
s,<NYT_LINKS.*?</NYT_LINKS>,,gis;
s,<NYT_LINKS_ONSITE.*?</?NYT_LINKS_ONSITE>,,gis;
s,<NYT_LINKS_OFFSITE.*?</?NYT_LINKS_OFFSITE>,,gis;

# Remove other NYT-specific tags:
s,<\/?NYT_.*?>,,gim;
}
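
The story pre-processing above works in two passes: first whole NYT_ elements are removed together with their contents, then any leftover NYT_ tags are stripped while the text between them is kept. A cut-down Python sketch of the same two passes follows; `strip_nyt_markup` is a made-up name, only two of the paired tags are listed, and the sample markup is invented.

```python
import re

# Elements removed together with everything inside them (subset of the list above).
PAIRED_TAGS = ["NYT_BANNER", "NYT_LINKS"]


def strip_nyt_markup(html: str) -> str:
    """Drop selected NYT_ elements entirely, then strip remaining NYT_ tags."""
    for tag in PAIRED_TAGS:
        html = re.sub(
            rf"<{tag}.*?</{tag}>", "", html, flags=re.IGNORECASE | re.DOTALL
        )
    # Second pass: remove stray opening and closing tags but keep their contents.
    return re.sub(r"</?NYT_.*?>", "", html, flags=re.IGNORECASE)


sample = (
    "<NYT_HEADLINE>Title</NYT_HEADLINE>"
    "<NYT_LINKS><a href='x'>related</a></NYT_LINKS>"
    "<NYT_TEXT>Body</NYT_TEXT>"
)
print(strip_nyt_markup(sample))
```

The order matters: if the generic tag stripping ran first, the link lists and banners would lose their wrappers and their contents would stay in the story.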

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/travel/text/index.html
Name: NYT Travel
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <!--END NAV -->
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="5" height="25" border="0"/>

# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
s,<br/><br/>,<p>,gis;
s,&,+,gis;
}

StoryURL: http://travel2.nytimes.com/.*\.html.*
StoryStart: <NYT_HEADLINE
StoryEnd: </NYT_TEXT
StoryToPrintableSub: s,(.*),$1?&pagewanted=print&position=,

# Story pre-processing:
StoryHTMLPreProcess: {
# Remove lists of online links, inline tables, inline images, etc.:
s,<NYT_AD.*?</NYT_ADD>,,gis;
s,<NYT_BANNER.*?</NYT_BANNER>,,gis;
s,<NYT_INLINEBLURB.*?</?NYT_INLINEBLURB>,,gis;
s,<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>,,gis;
s,<NYT_INLINETABLE.*?</?NYT_INLINETABLE>,,gis;
s,<NYT_LINKS.*?</NYT_LINKS>,,gis;
s,<NYT_LINKS_ONSITE.*?</?NYT_LINKS_ONSITE>,,gis;
s,<NYT_LINKS_OFFSITE.*?</?NYT_LINKS_OFFSITE>,,gis;

# Remove other NYT-specific tags:
s,<\/?NYT_.*?>,,gim;
}

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ytimes.com/pages/weekinreview/text/index.html
Name: NYT WeekReview
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>
# ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="469" height="1" border="0"/>

# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}

URL: http://www.usatoday.com/usatonline/newsindex11.htm
Levels: 2
Name: USANews
StoryURL: http://www.usatoday.com/usatonline/.*
ContentsStart: <!-- Begin Index -->
ContentsEnd: <!-- END DATA -->
StoryStart: <!-- BEGIN DATA -->
StoryEnd: <!-- End Story -->
StorySkipURL: http://www.usatoday.com/usatonline/newsindex12.htm

URL: http://www.usatoday.com/usatonline/newsindex12.htm
Levels: 2
Name: USANews2
StoryURL: http://www.usatoday.com/usatonline/.*
ContentsStart: <!-- Begin Index -->
ContentsEnd: <!-- END DATA -->
StoryStart: <!-- BEGIN DATA -->
StoryEnd: <!-- End Story -->
StorySkipURL: http://www.usatoday.com/usatonline/newsindex11.htm

URL: http://www.usatoday.com/usatonline/lifeindex11.htm
Levels: 2
Name: USALife
StoryURL: http://www.usatoday.com/usatonline/.*
ContentsStart: <!-- Begin Index -->
ContentsEnd: <!-- END DATA -->
StoryStart: <!-- BEGIN DATA -->
StoryEnd: <!-- End Story -->
StorySkipURL: http://www.usatoday.com/usatonline/lifeindex12.htm

URL: http://www.usatoday.com/usatonline/lifeindex12.htm
Levels: 2
Name: USALife2
StoryURL: http://www.usatoday.com/usatonline/.*
ContentsStart: <!-- Begin Index -->
ContentsEnd: <!-- END DATA -->
StoryStart: <!-- BEGIN DATA -->
StoryEnd: <!-- End Story -->
StorySkipURL: http://www.usatoday.com/usatonline/lifeindex11.htm

08-02-2004, 08:17 AM   #2
Colin Dunstan
My goodness! This is some great work... thank you!

/me goes and installs the msnbc scoop

08-02-2004, 08:31 AM   #3
ignatz
Thanks geoffrey, this is fantastic! And thanks for double posting to the sitescooper mailing list and here at mobileread. I'm going to be digging through this list in the short term (at least the English pages!).

08-02-2004, 02:48 PM   #4
Alexander Turcic
Thanks a lot! Reminds me to spend some more time on Sitescooper again!

08-03-2004, 01:34 AM   #5
geoffreynz

Glad that I could contribute. I also made NYT ones for National and Arts; like I said, they're virtually identical apart from the URLs. I just fixed up "Die Zeit" at the weekend, and it rocks! Enjoy, and please post any new site files you make. I really want site files for the Washington Post. Because of the requirement to register, this could be difficult, but there must be a way somehow, like there was for the NYT. Does anyone have any ideas?

Geoffrey

INDEX:
Die Zeit x5
gms Reise
NYT x2
Sunday Herald x7

URL: http://www.zeit.de/feuilleton/index
Name: Zeit Feuilleton
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <td width="20px"><img alt="" src="http://zeus.zeit.de/images/transparent_pixel.gif" width="20px"
ContentsEnd: src="http://zeus.zeit.de/images/transparent_pixel.gif" valign="bottom" align="right" width="65"
StoryURL: http://www.zeit.de/\d+/\d+/.*
StoryStart: <div class="text">
StoryEnd: <p class="mainnavigation">
StoryHTMLPreProcess: {
s,..8222;,„,gis;
}
ContentsHTMLPreProcess: {
s,..8222;,„,gis;
}
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/.*/.*\.jpg
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/politik/.*\.gif

URL: http://www.zeit.de/literatur/index
Name: Zeit Literatur
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <td width="20px"><img alt="" src="http://zeus.zeit.de/images/transparent_pixel.gif" width="20px"
ContentsEnd: src="http://zeus.zeit.de/images/transparent_pixel.gif" valign="bottom" align="right" width="65"
StoryURL: http://www.zeit.de/\d+/\d+/.*
StoryStart: <div class="text">
StoryEnd: <p class="mainnavigation">
StoryHTMLPreProcess: {
s,..8222;,„,gis;
}
ContentsHTMLPreProcess: {
s,..8222;,„,gis;
}
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/.*/.*\.jpg
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/politik/.*\.gif

URL: http://www.zeit.de/politik/
Name: Zeit Politik
AuthorName: Geoffrey Miller
Levels: 2
# ContentsStart: <td width="20px"><img alt=""
ContentsStart: <img alt="" border="0" src="http://zeus.zeit.de/bilder/elemente/aktuelle_ausgabe_386.gif" align="center" vspace="0" width="386" class="teaserimage">
# ContentsEnd: <td width="100%">
ContentsEnd: <td width="140px" valign="top">
StoryURL: http://www.zeit.de/\d+/\d+/.*
StoryStart: <div class="text">
StoryEnd: <p class="mainnavigation">
# StoryEnd: <p class="mainnavigation"><a href="#top">ZUM ARTIKELANFANG</a></p>
StoryHTMLPreProcess: {
s,..8222;,„,gis;
}
ContentsHTMLPreProcess: {
s,..8222;,„,gis;
}
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/.*/.*\.jpg
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/politik/.*\.gif

URL: http://www.zeit.de/reisen/index
Name: Zeit Reisen
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <td width="20px"><img alt="" src="http://zeus.zeit.de/images/transparent_pixel.gif" width="20px"
ContentsEnd: src="http://zeus.zeit.de/images/transparent_pixel.gif" valign="bottom" align="right" width="65"
StoryURL: http://www.zeit.de/\d+/\d+/.*
StoryStart: <div class="text">
StoryEnd: <p class="mainnavigation">
# StoryEnd: <p class="mainnavigation"><a href="#top">ZUM ARTIKELANFANG</a></p>
StoryHTMLPreProcess: {
s,..8222;,„,gis;
}
ContentsHTMLPreProcess: {
s,..8222;,„,gis;
}
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/.*/.*\.jpg
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/politik/.*\.gif

URL: http://www.zeit.de/wirtschaft/index
Name: Zeit Wirtschaft
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <td width="20px"><img alt="" src="http://zeus.zeit.de/images/transparent_pixel.gif" width="20px"
ContentsEnd: src="http://zeus.zeit.de/images/transparent_pixel.gif" valign="bottom" align="right" width="65"
StoryURL: http://www.zeit.de/\d+/\d+/.*
StoryStart: <div class="text">
StoryEnd: <p class="mainnavigation"><a href="#top">ZUM ARTIKELANFANG</a></p>
StoryHTMLPreProcess: {
s,..8222;,„,gis;
}
ContentsHTMLPreProcess: {
s,..8222;,„,gis;
}
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/.*/.*\.jpg
# ImageURL: http://zeus.zeit.de/bilder/\d+/\d+/politik/.*\.gif
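
For anyone adapting the Zeit files, here is a rough Python sketch of what the two regex-driven lines appear to do. It is only an illustration, not how Sitescooper itself runs them: the function names are made up, the sample URLs are hypothetical, and it assumes the StoryURL pattern has to match the whole URL. Note that `..8222;` matches any two characters followed by `8222;`, which covers the entity `&#8222;` but is looser than spelling the entity out.

```python
import re

# Same pattern as the StoryURL line in the Zeit files above.
STORY_URL = r"http://www.zeit.de/\d+/\d+/.*"


def fix_low_quotes(html: str) -> str:
    """Rough equivalent of  s,..8222;,„,gis;  (g = all matches, i = ignore case, s = dot matches newline)."""
    # "\u201e" is the German low opening quotation mark.
    return re.sub(r"..8222;", "\u201e", html, flags=re.IGNORECASE | re.DOTALL)


def is_story_url(url: str) -> bool:
    """Assumption: the StoryURL pattern must cover the entire URL."""
    return re.fullmatch(STORY_URL, url) is not None


if __name__ == "__main__":
    print(fix_low_quotes("Er sagte: &#8222;Hallo"))
    print(is_story_url("http://www.zeit.de/2004/32/Beispiel"))  # hypothetical story URL
    print(is_story_url("http://www.zeit.de/politik/"))          # section index, not a story
```

The year/issue digits in the pattern are what keep section index pages out of the story list.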

URL: http://www.ikz-online.de/ikz/ikz.rei...ueberblick.php
Levels: 2
Name: gms Reise
ContentsStart: header.berichte2.gif
ContentsEnd: <!-- Ende - Z_2sp_dpa_Uebers_Fortl_SQL -->
StoryURL: http://www.ikz-online.de/.*
StoryStart: <!-- Ende - Z_2sp_Multicom_Lang_SQL -->
StoryEnd: <span class="contentfliess">
ImageURL: http://www.ikz-online.de/includes/bi....php?.*.nitf.*

# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ext/index.html

Name: NYT Arts
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>

# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}


StoryURL: http://www.nytimes.com/.*\.html.*


StoryStart: <NYT_HEADLINE
StoryEnd: </NYT_TEXT
StoryToPrintableSub: s,(.*),$1?position=&pagewanted=print&position=,


# Story pre-processing:
StoryHTMLPreProcess: {
# Remove lists of online links, inline tables, inline images, etc.:
s,<NYT_AD.*?</NYT_ADD?>,,gis;
s,<NYT_BANNER.*?</NYT_BANNER>,,gis;
s,<NYT_INLINEBLURB.*?</?NYT_INLINEBLURB>,,gis;
s,<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>,,gis;
s,<NYT_INLINETABLE.*?</?NYT_INLINETABLE>,,gis;
s,<NYT_LINKS.*?</NYT_LINKS>,,gis;
s,<NYT_LINKS_ONSITE.*?</?NYT_LINKS_ONSITE>,,gis;
s,<NYT_LINKS_OFFSITE.*?</?NYT_LINKS_OFFSITE>,,gis;

# Remove other NYT-specific tags:
s,<\/?NYT_.*?>,,gim;
}
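
If you want to try the story clean-up rules outside Sitescooper before changing them, something like the following Python sketch may help. It copies only a subset of the substitutions above, the helper name `strip_nyt_markup` and the sample HTML are invented for the illustration, and it makes no claim about the order in which Sitescooper applies StoryStart/StoryEnd and the pre-processing.

```python
import re

# A subset of the paired-tag rules from StoryHTMLPreProcess (Perl flags "gis").
BLOCK_RULES = [
    r"<NYT_BANNER.*?</NYT_BANNER>",
    r"<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>",
    r"<NYT_LINKS.*?</NYT_LINKS>",
]


def strip_nyt_markup(html: str) -> str:
    """Drop whole NYT_* blocks first, then any leftover NYT_* tags."""
    for pattern in BLOCK_RULES:
        # i = ignore case, s = let "." span newlines; re.sub replaces every match (g).
        html = re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)
    # Final rule  s,<\/?NYT_.*?>,,gim;  removes only the tags and keeps the text between them.
    return re.sub(r"</?NYT_.*?>", "", html, flags=re.IGNORECASE | re.MULTILINE)


if __name__ == "__main__":
    sample = (
        '<NYT_HEADLINE version="1.0">Headline</NYT_HEADLINE>\n'
        "<NYT_BANNER>ad\nbanner</NYT_BANNER>"
        "<NYT_TEXT>Body text.</NYT_TEXT>"
    )
    print(strip_nyt_markup(sample))
```

The block rules have to run before the generic tag rule, otherwise the tags that mark the unwanted blocks would already be gone and their contents would stay in the story.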


# Change YOURID and YOURPASSWORD to your own NYT details!
URL: http://www.nytimes.com/auth/chk_logi...ext/index.html

Name: NYT National
AuthorName: Edited by Geoffrey Miller mostly from Kennis Koldewyn's .site files
# Thanks to Kennis Koldewyn for helping me with the NYT files
# This format will probably work with most other NYT text-only URLs, try it and see
# Post the results when you're finished!
Levels: 2
ContentsStart: <td rowspan="3" width="1" bgcolor="#E3E3E3" valign="top">
ContentsEnd: <IMG src="http://graphics7.nytimes.com/images/misc/spacer.gif" width="459" height="1" border="0"/>


# Contents pre-processing:
ContentsHTMLPreProcess: {
# Change font-hacking into heading:
s,<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>,<H1>$1</H1>,gis;

# Change empty paragraphs into breaks:
s,<P></P>,<BR>,gis;
}


StoryURL: http://www.nytimes.com/.*\.html.*


StoryStart: <NYT_HEADLINE
StoryEnd: </NYT_TEXT
StoryToPrintableSub: s,(.*),$1?position=&pagewanted=print&position=,


# Story pre-processing:
StoryHTMLPreProcess: {
# Remove lists of online links, inline tables, inline images, etc.:
s,<NYT_AD.*?</NYT_ADD?>,,gis;
s,<NYT_BANNER.*?</NYT_BANNER>,,gis;
s,<NYT_INLINEBLURB.*?</?NYT_INLINEBLURB>,,gis;
s,<NYT_INLINEIMAGE.*?</?NYT_INLINEIMAGE>,,gis;
s,<NYT_INLINETABLE.*?</?NYT_INLINETABLE>,,gis;
s,<NYT_LINKS.*?</NYT_LINKS>,,gis;
s,<NYT_LINKS_ONSITE.*?</?NYT_LINKS_ONSITE>,,gis;
s,<NYT_LINKS_OFFSITE.*?</?NYT_LINKS_OFFSITE>,,gis;

# Remove other NYT-specific tags:
s,<\/?NYT_.*?>,,gim;
}
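
The contents pre-processing and the StoryToPrintableSub line in the two NYT files can be checked the same way. This is a minimal Python sketch under my own assumptions: the function names are hypothetical, the article URL is a made-up example, and the Perl substitution without a `g` flag is modelled with `count=1`.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL  # Perl flags "i" and "s"; re.sub already replaces all matches ("g")


def preprocess_contents(html: str) -> str:
    """Mirror of the two ContentsHTMLPreProcess rules."""
    # Turn the font-size hack into a real heading.
    html = re.sub(
        r'<FONT SIZE="\+1"><STRONG>(.*?)</STRONG></FONT><P></P>',
        r"<H1>\1</H1>",
        html,
        flags=FLAGS,
    )
    # Turn the remaining empty paragraphs into line breaks.
    return re.sub(r"<P></P>", "<BR>", html, flags=FLAGS)


def to_printable(url: str) -> str:
    """Mirror of  s,(.*),$1?position=&pagewanted=print&position=,  (first match only)."""
    return re.sub(r"(.*)", r"\1?position=&pagewanted=print&position=", url, count=1)


if __name__ == "__main__":
    print(preprocess_contents('<font size="+1"><strong>Arts</strong></font><p></p>Story<P></P>'))
    print(to_printable("http://www.nytimes.com/2004/08/02/arts/example.html"))  # hypothetical URL
```

The heading rule must come first, because it relies on the `<P></P>` that the second rule would otherwise already have turned into `<BR>`.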

URL: http://www.sundayherald.com/newshome.shtml
Name: SH-News
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/sevendayshome.shtml
Name: SH-7days
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/businesshome.shtml
Name: SH-Business
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/focushome.shtml
Name: SH-NewsFocus
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/internationalhome.shtml
Name: SH-World
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/magazinehome.shtml
Name: SH-Magazine
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,

URL: http://www.sundayherald.com/reviewhome.shtml
Name: SH-Review
Description: Scottish Sunday newspaper
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- content -->
ContentsEnd: <td width="10" bgcolor="FFFFFF"></td>
StoryURL: http://www.sundayherald.com/.*\d+\.*
StoryStart: <table width="100%"
StoryEnd: Back to previous page
#StoryStart: <div class="headline">
#StoryEnd: <script language="JavaScript">
StoryToPrintableSub: s,^(http://www.sundayherald.com)/(\d+)\S*,\1/print\2,

#StoryStart: <div class="bodyTextPrint">
#StoryEnd: <a href="javascript:history.back()">Back to previous page</a>
#StorytoPrintableSub: s,(.*),http://www.sundayherald/print\d+,
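
All seven Sunday Herald files share the same StoryURL and StoryToPrintableSub lines, so here is a small Python sketch of how those two patterns behave. Treat it as a sketch only: the story number 43812 is invented, the function names are hypothetical, and it assumes Sitescooper requires StoryURL to match the whole URL.

```python
import re

# Same pattern as the StoryURL line in the Sunday Herald files above.
STORY_URL = r"http://www.sundayherald.com/.*\d+\.*"


def is_story_url(url: str) -> bool:
    """Assumption: the StoryURL pattern must cover the entire URL."""
    return re.fullmatch(STORY_URL, url) is not None


def to_printable(url: str) -> str:
    """Mirror of  s,^(http://www.sundayherald.com)/(\\d+)\\S*,\\1/print\\2,"""
    return re.sub(
        r"^(http://www.sundayherald.com)/(\d+)\S*",
        r"\1/print\2",
        url,
        count=1,
    )


if __name__ == "__main__":
    story = "http://www.sundayherald.com/43812"  # hypothetical story number
    print(is_story_url(story))
    print(is_story_url("http://www.sundayherald.com/newshome.shtml"))
    print(to_printable(story))
```

Because the story pattern needs digits near the end of the URL, the section home pages such as newshome.shtml are not picked up as stories.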
All times are GMT -4.