04-03-2004, 03:42 AM   #7
Alexander Turcic
I've improved the original Economist scoop somewhat so that it excludes unwanted content.

Code:
URL: http://www.economist.com/
Name: Economist
Description: Economist
AuthorName: Goh Boon Nam

# General Settings
Active: 1
SizeLimit: 2000
Levels: 2

# Image Settings
ImageURL: http://www.economist.com/images/dingbats/e5.gif
ImageURL: http://www.economist.com/images/\d+/.*
UseAltTagForURL: 0

# Content Settings
ContentsStart: <td colspan="7" width="447" valign="top">
ContentsEnd: <a href="/diversions/quiz/">
ContentsUseTableSmarts: 0

# Story Settings
StoryToPrintableSub: s!displayStory.cfm!PrinterFriendly.cfm!
StoryURL: http://www.economist.com/(.*?)/PrinterFriendly.cfm(.*?)

# PreProcess Settings
ContentsHTMLPreProcess: {
	# remove ads...hope that's not killing it when layout changes
	s,<div align="center">[^<]<a href="/printedition/">.*<td width="209" valign="top" height="1700">,</font>,gim;
	# remove the 'More from...' Links
	s,<div align="right"><b><a href="[^"]*"><font[^>]*>[^/]*</font></a></b></div><br>,,gim;
	# remove the 'More reviews...' Links
	s,<div align="right"><font[^>]*><b><a href="[^"]*"><font color="[^"]*">More reviews</font></a></b></div>,,gim;
	# gfx -> txt headers
	s,<a href="[^"]*"><img src="/images/sections/(\w+)\.gif"[^>]*></a><br><br>,<hr>Section: $1<br>,gim;
	# gfx -> txt header "markets"
	s,<p><a href="[^"]*"><img alt="MARKETS" border="0" src="/images/sections/m-d\.gif" width="207" height="19"></a></p>,<hr>Section: markets<br>,gim;
	# remove links to pay-content
	s,<a href="[^"]*">([^<]*)</a></b>\s<img alt="E\+" width="17" height="10" border="0" src="/images/dingbats/e5\.gif">,$1<font size=1>(pay-content)</font></b><img width="17" height="10" border="0" src="/images/dingbats/e5\.gif">,gim;
	# remove links to pay-content (alternate <img> markup)
	s,<a href="[^"]*">([^<]*)</a></b>\s<img src="/images/dingbats/e5\.gif" alt="" />,$1<font size=1>(pay-content)</font></b><img width="17" height="10" border="0" src="/images/dingbats/e5\.gif">,gim;
	# remove Also-on-the-site... column
	s,<img alt="also on the site ...".*</td></tr></table><br>,,gim;
}

StoryHTMLPreProcess: {
	# remove 'get article background...'
	s,<p>.*<a target="background"[^>]*">.*background</b></font></a></font></p><!--back-->,,gim;
	s/align="right"//gim;
	s/align="center"//gim;
	s/align=right//gim;
	s/align=center//gim;
}
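
To try the story-URL rewrite and the section-header substitution outside of Sitescooper, here is a rough Python sketch. It is only a stand-in for the Perl substitutions in the site file, not how Sitescooper itself runs them. The function names, the sample URL, and the sample HTML are made up for illustration, and the whole-string match on StoryURL is an assumption about how the pattern is applied.

```python
import re

# Stand-in for: StoryToPrintableSub: s!displayStory.cfm!PrinterFriendly.cfm!
# A Perl s!..!..! without /g replaces only the first match, hence count=1.
def to_printable(url: str) -> str:
    return re.sub(r"displayStory.cfm", "PrinterFriendly.cfm", url, count=1)

# Stand-in for the StoryURL pattern from the site file (copied unchanged).
STORY_URL = re.compile(
    r"http://www.economist.com/(.*?)/PrinterFriendly.cfm(.*?)"
)

def is_story_url(url: str) -> bool:
    # Assumption: the pattern has to cover the whole URL.
    return STORY_URL.fullmatch(url) is not None

# Stand-in for the "gfx -> txt headers" rule; re.sub is global like /g,
# and the flags mirror Perl's /i and /m.
SECTION_IMG = re.compile(
    r'<a href="[^"]*"><img src="/images/sections/(\w+)\.gif"[^>]*></a><br><br>',
    re.IGNORECASE | re.MULTILINE,
)

def section_headers_to_text(html: str) -> str:
    return SECTION_IMG.sub(r"<hr>Section: \1<br>", html)

if __name__ == "__main__":
    # Hypothetical story link, not taken from the real site.
    url = "http://www.economist.com/world/displayStory.cfm?story_id=1"
    print(to_printable(url))
    print(is_story_url(to_printable(url)))

    # Hypothetical snippet shaped like the markup the rule expects.
    html = ('<a href="/world/"><img src="/images/sections/world.gif" '
            'border="0"></a><br><br>')
    print(section_headers_to_text(html))
```

Running it on a saved copy of the front page is a quick way to see whether a rule still matches after a layout change.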
Greets

Alex
Attached Files
File Type: site economist.site (2.2 KB, 936 views)