Quote:
Originally Posted by PizzaMan1979
There are some medical sites well suited for raiding that a lot of doctors (myself included) would love to have on their Palm. In fact, I had just ripped the HTML files and was about to process them with iSilo when I came across SiteRaider. Is it technically possible to make scripts for
www.wrongdiagnosis.com
and
www.whonamedit.com
I'd be very grateful. arendhamming a gmail d com
For the record: I made a script several years ago, which was available through the TomeRaider site.
It looked like this:
Code:
// Last Update : September 14, 2005 By PizzaMan
BEGIN_DEF
RS_VERSION=1.001
START_URL="http://www.whonamedit.com/azeponyms.cfm/A.html"
BASE_URL = "www.whonamedit.com"
SOURCE="Who Named It"
OUTPUT_TO_FILE_NAME = "Who Named It"
CALL Start
END
//--------------------------------------------------
BEGIN_PROCESS Start
CALL RaidDocsPage
REPEAT_FOR_ALL_LINKS LIST_1 FormatLinks1
// TEST_LINK_PROCESSING LIST_1
REPEAT_FOR_ALL_LINKS LIST_1 RaidDoctorsLinks
REPEAT_FOR_ALL_LINKS LIST_2 FormatLinks2
// TEST_LINK_PROCESSING LIST_2
REPEAT_FOR_ALL_LINKS LIST_2 RaidDocsArticle
END
//--------------------------------------------------
BEGIN_PROCESS RaidDocsPage
DOWNLOAD_PAGE
INCLUDE_LINKS = "az"
// INCLUDE_LINKS = "a."
GET_LINKS LIST_1
END
//--------------------------------------------------
BEGIN_PROCESS RaidDoctorsLinks
DOWNLOAD_PAGE
INCLUDE_LINKS = "doctor"
INCLUDE_LINKS = "synd"
// INCLUDE_LINKS = "34"
GET_LINKS LIST_2
END
//--------------------------------------------------
BEGIN_PROCESS FormatLinks1
// Repair index links where the host and page name were joined without "/"
VAR = LINK
VAR_REPLACE "www.whonamedit.comazlist.cfm/" WITH "http://www.whonamedit.com/azlist.cfm/"
VAR_REPLACE "www.whonamedit.comazeponyms.cfm/" WITH "http://www.whonamedit.com/azeponyms.cfm/"
LINK = VAR
END
//--------------------------------------------------
BEGIN_PROCESS FormatLinks2
// Same repair for the doctor and syndrome article links
VAR = LINK
VAR_REPLACE "www.whonamedit.comdoctor.cfm/" WITH "http://www.whonamedit.com/doctor.cfm/"
VAR_REPLACE "www.whonamedit.comsynd.cfm/" WITH "http://www.whonamedit.com/synd.cfm/"
LINK = VAR
END
//--------------------------------------------------
BEGIN_PROCESS RaidDocsArticle
// Download page
DOWNLOAD_PAGE
// Article body
ARTICLE_FROM "<span c" TO "</td>"
//Acquire Title
VAR = BODY
VAR_REMOVE_FROM VAR_START TO ">"
VAR_REMOVE_FROM "<" TO VAR_END
TITLE = VAR
//Article author
VAR = BODY
IF_VAR_CONTAINS "thank" SetAuthor1
//Category (syndrome or doctor) and body cleanup
VAR = BODY
IF_VAR_CONTAINS "ssociated person" SetTitleSynd
IF_VAR_CONTAINS "ssociated epo" SetTitleDoc
VAR_REMOVE_FROM VAR_START TO ">"
VAR_REPLACE "synd.cfm/" WITH "http://www.whonamedit.com/synd.cfm/"
BODY = VAR
// Write article
WRITE_ARTICLE
END
//--------------------------------------------------
BEGIN_PROCESS SetTitleSynd
//Category information
PRIMARY_CATEGORY="Syndrome"
END
//--------------------------------------------------
BEGIN_PROCESS SetTitleDoc
//Category information
PRIMARY_CATEGORY="Doctor"
END
//--------------------------------------------------
BEGIN_PROCESS SetAuthor1
//Set Author
VAR = BODY
VAR_REMOVE_FROM VAR_START TO "thank"
VAR_REMOVE_FROM ", for" TO VAR_END
VAR_REMOVE_FROM "for" TO VAR_END
AUTHOR = VAR
END
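
For anyone who doesn't know the script language: the two FormatLinks processes are just plain string replacement. The links come back with the host and the page name glued together without a slash, and the VAR_REPLACE lines put the "http://" and the "/" back. Below is a rough Python sketch of the same idea. It is not part of the script and not how SiteRaider works internally; the names `fix_link`, `BASE` and `PAGES` are made up for the illustration.

```python
# Hypothetical Python equivalent of the FormatLinks1 / FormatLinks2 steps.
# BASE and PAGES mirror the strings used in the VAR_REPLACE lines above.
BASE = "www.whonamedit.com"
PAGES = ("azlist.cfm/", "azeponyms.cfm/", "doctor.cfm/", "synd.cfm/")


def fix_link(link: str) -> str:
    """Rewrite a link whose host and page name were joined without a slash."""
    for page in PAGES:
        broken = BASE + page                   # e.g. "www.whonamedit.comsynd.cfm/"
        fixed = "http://" + BASE + "/" + page  # e.g. "http://www.whonamedit.com/synd.cfm/"
        link = link.replace(broken, fixed)
    return link
```

A link that is already well formed does not contain any of the broken strings, so it passes through unchanged.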