08-20-2005, 07:40 PM   #1
geoffreynz
Device: Palm m515
ITALIAN: Reuters Italia/Panorama

Hi all,

Here are two Italian sitescooper .site files I've made. Some of the PreProcess lines are probably unnecessary, since I used my Die Zeit files as a template, but without the character substitutions, special characters turn out garbled on my Palm (ò shows up as Ã², for example)!
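The garbling described above looks like UTF-8 text being read as a single-byte Windows encoding. That is an assumption, since the post does not name the encodings involved. A minimal Python sketch (the function names `garble` and `repair` are made up for illustration) reproduces the effect and prints the two-character sequences that the left-hand side of each substitution rule would need to match:

```python
# Sketch only: assumes the pages are UTF-8 and are misread as Windows-1252.

def garble(text: str) -> str:
    """Encode as UTF-8, then misread the resulting bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo garble(): turn the misread characters back into bytes, decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    # The characters handled by the PreProcess rules in the .site files.
    for ch in "òéùìèà©äüßöÜÖÄ":
        print(repr(ch), "is displayed as", repr(garble(ch)))
```

Note that à comes out as Ã followed by a non-breaking space (U+00A0), which is easy to mistake for an ordinary space when copying a rule by hand.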

Here's Reuters Italia top news:

Code:
URL: http://today.reuters.it/news/default.aspx
Name: Reuters Italienisch
Description: Reuters Nachrichten auf Italienisch
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- Left Column -->
ContentsEnd: <td colspan="2"><img border="0" src="images/clear.gif"
StoryURL: http://today.reuters.it/PrinterFriendlyPopup.aspx\?type=topNews\&storyID=.*
StoryToPrintableSub: s,^http://today.reuters.it/news/.ewsArticle.aspx\?type=topNews\&storyID=(.*)\S*,http://today.reuters.it/PrinterFriendlyPopup.aspx\?type=topNews\&storyID=uri\%3a$1,
StoryStart: <td id="StoryDataCell" colspan="2" valign="top">
StoryEnd: <td colspan="2" align="right">
ContentsUseTableSmarts: 1
ImageURL: .*\.jpg.*
ContentsHTMLPreProcess: {
s,Ã²,ò,gis;
s,Ã©,é,gis;
s,Ã¹,ù,gis;
s,Ã¬,ì,gis;
s,Ã¨,è,gis;
s,Ã\xA0,à,gis;
s,Â©,©,gis;
s,Ã¤,ä,gis;
s,Ã¼,ü,gis;
s,ÃŸ,ß,gis;
s,Ã¶,ö,gis;
s,Ãœ,Ü,gis;
s,Ã–,Ö,gis;
s,Ã„,Ä,gis;
s,<span class="artTitle">(.*?)</span>,<b>$1</b>,gis;
s,<span class="newsDate">(.*?)</span>,<br><i>$1</i>,gis;
s,<td colspan="2" valign="top" class="medium">,<HR>,gis;
}
StoryHTMLPreProcess: {
s,Ã²,ò,gis;
s,Ã©,é,gis;
s,Ã¹,ù,gis;
s,Ã¬,ì,gis;
s,Ã¨,è,gis;
s,Ã\xA0,à,gis;
s,Â©,©,gis;
s,Ã¤,ä,gis;
s,Ã¼,ü,gis;
s,ÃŸ,ß,gis;
s,Ã¶,ö,gis;
s,Ãœ,Ü,gis;
s,Ã–,Ö,gis;
s,Ã„,Ä,gis;
s,<a\b.*?>,,gis;
s,<br.*?>,,gis;
s,<hr.*?>,,gis;
s,<span class="artTitle">(.*?)</span>,<b>$1</b>,gis;
s,<span class="newsDate">(.*?)</span>,<br><i>$1</i>,gis;
s,<td colspan="2" valign="top" class="medium">,<HR>,gis;
}
And here's Panorama (Mondo section):

Code:
URL: http://www.panorama.it/mondo/index.html
Name: Panorama Mondo
AuthorName: Geoffrey Miller
Levels: 2
ContentsStart: <!-- Riga 2 -->
ContentsEnd: <!-- fine Riga 9 -->
StoryURL: http://www.panorama.it/home/stampa/articolo/ix1-.*/idxsl1-stampaarticolo
StoryToPrintableSub: s,^http://www.panorama.it/.*/.*/articolo/(ix1-.*)\S*,http://www.panorama.it/home/stampa/articolo/$1/idxsl1-stampaarticolo,
StoryStart: <span class="txtliv3b">
StoryEnd: <!-- fine Riga 1 -->
ContentsUseTableSmarts: 0
ContentsHTMLPreProcess: {
s,Ã²,ò,gis;
s,Ã©,é,gis;
s,Ã¹,ù,gis;
s,Ã¬,ì,gis;
s,Ã¨,è,gis;
s,Ã\xA0,à,gis;
s,Â©,©,gis;
s,Ã¤,ä,gis;
s,Ã¼,ü,gis;
s,ÃŸ,ß,gis;
s,Ã¶,ö,gis;
s,Ãœ,Ü,gis;
s,Ã–,Ö,gis;
s,Ã„,Ä,gis;
s,<span class="artTitle">(.*?)</span>,<b>$1</b>,gis;
s,<span class="newsDate">(.*?)</span>,<br><i>$1</i>,gis;
s,<td colspan="2" valign="top" class="medium">,<HR>,gis;
}
StoryHTMLPreProcess: {
s,Ã²,ò,gis;
s,Ã©,é,gis;
s,Ã¹,ù,gis;
s,Ã¬,ì,gis;
s,Ã¨,è,gis;
s,Ã\xA0,à,gis;
s,Â©,©,gis;
s,Ã¤,ä,gis;
s,Ã¼,ü,gis;
s,ÃŸ,ß,gis;
s,Ã¶,ö,gis;
s,Ãœ,Ü,gis;
s,Ã–,Ö,gis;
s,Ã„,Ä,gis;
s,<a\b.*?>,,gis;
s,<br.*?>,,gis;
s,<hr.*?>,,gis;
s,<span class="artTitle">(.*?)</span>,<b>$1</b>,gis;
s,<span class="newsDate">(.*?)</span>,<br><i>$1</i>,gis;
s,<td colspan="2" valign="top" class="medium">,<HR>,gis;
}
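The two StoryToPrintableSub rules can be tried out away from the Palm with a rough Python translation. This is only a sketch: Python's `re.sub` stands in for the Perl `s,,,` operator, the names `to_printable`, `REUTERS_SUB` and `PANORAMA_SUB` are invented here, and the sample story IDs are made up rather than taken from either site.

```python
import re

# (pattern, replacement) pairs transcribed from the StoryToPrintableSub lines.
# \g<1> is Python's spelling of Perl's $1.
REUTERS_SUB = (
    r"^http://today.reuters.it/news/.ewsArticle.aspx\?type=topNews&storyID=(.*)\S*",
    r"http://today.reuters.it/PrinterFriendlyPopup.aspx?type=topNews&storyID=uri%3a\g<1>",
)
PANORAMA_SUB = (
    r"^http://www.panorama.it/.*/.*/articolo/(ix1-.*)\S*",
    r"http://www.panorama.it/home/stampa/articolo/\g<1>/idxsl1-stampaarticolo",
)

# The StoryURL patterns, used to check that a rewritten link would be accepted.
REUTERS_STORY_URL = r"http://today.reuters.it/PrinterFriendlyPopup.aspx\?type=topNews&storyID=.*"
PANORAMA_STORY_URL = r"http://www.panorama.it/home/stampa/articolo/ix1-.*/idxsl1-stampaarticolo"


def to_printable(url: str, rule: tuple) -> str:
    """Apply one (pattern, replacement) rule to a story link."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, url)


if __name__ == "__main__":
    # Hypothetical links, for illustration only.
    samples = [
        ("http://today.reuters.it/news/newsArticle.aspx?type=topNews&storyID=EXAMPLE123",
         REUTERS_SUB, REUTERS_STORY_URL),
        ("http://www.panorama.it/mondo/europa/articolo/ix1-EXAMPLE",
         PANORAMA_SUB, PANORAMA_STORY_URL),
    ]
    for url, rule, story_url in samples:
        printable = to_printable(url, rule)
        print(printable, "| matches StoryURL:", bool(re.fullmatch(story_url, printable)))
```

Checking the rewritten link against StoryURL is useful because a printable URL that does not match StoryURL would presumably not be followed as a story.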
Hope somebody finds them of use.

Regards,

Geoffrey