#1 |
Connoisseur
Posts: 62
Karma: 72
Join Date: Oct 2002
Location: Germany
Device: nook
|
universal RSS-feed creator
Hi all,
I just found http://www.wotzwot.com/. It converts a site into an RSS feed - great! A site whose feed I have long been missing, http://www.freewarepalm.com/moresoftware.shtml, converts to: wotzwot -stobs. |
#2 |
Is papyrophobic!
Posts: 1,926
Karma: 1009999
Join Date: Aug 2003
Location: USA
Device: Dell Axim
|
So wotzwot is something like a universal page scraper? My biggest concern is how they handle caching. Are scraped sites being cached? Or are they re-scraped every time someone polls the feed? I am sure some webmasters would get seriously pissed if someone suddenly started scraping their sites constantly.
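
To make the two options in the question concrete, here is a minimal sketch of the "cached" variant: a scraper that only goes back to the origin site once its stored copy is older than a fixed lifetime. Nothing here is known about how wotzwot actually works; `CachedScraper`, the `fetch` callable, and the 15-minute default are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedScraper:
    """Serve a scraped page from memory until it is older than ttl_seconds."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical function that downloads a URL
        self._ttl = ttl_seconds  # assumed lifetime; 900 s is an arbitrary default
        self._clock = clock      # injectable so the behaviour can be tested
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # fresh enough: the origin site is not contacted
        body = self._fetch(url)  # stale or missing: scrape the origin once
        self._cache[url] = (now, body)
        return body
```

With a cache like this, the origin site sees at most one request per URL per lifetime, no matter how many readers poll the generated feed. Without it, every poll turns into a scrape.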
|
#3 | |
Connoisseur
Posts: 62
Karma: 72
Join Date: Oct 2002
Location: Germany
Device: nook
|
I think there is no reason to cache the content. You define the rules and they fetch the content when you request it.
I'd like to invite everyone to collect some sites here. Mine:
German BSI: http://www.wotzwot.com/rssxl.php?pag...=%3C%2Fspan%3E
German Plock Magazine (golf): http://www.wotzwot.com/rssxl.php?pag...ble%3E&sd=&ed=
-Stobs.
|
|
#4 |
Fully Converged
Posts: 18,171
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
But imagine someone requesting the feed every 5 minutes. I had a few users who were polling the PalmGear feeds from our site every 5 seconds (!!!). Even worse, their feed client didn't support conditional GETs, so it fetched the entire feed each time. Nor did it support gzip compression. So every 5 seconds the user polled both PalmGear feeds, which are each around 20k uncompressed. In other words, every minute he pulled almost half a MB of bandwidth from us.
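
The numbers above can be checked with a few lines, and the same sketch shows what a conditional GET saves. This assumes "20k" means 20,000 bytes and uses a content hash as the ETag; `bytes_per_minute`, `make_etag`, and `respond` are hypothetical helper names, and the `If-None-Match` handling is simplified to a single exact value.

```python
import hashlib
from typing import Optional, Tuple


def bytes_per_minute(poll_interval_s: float, feed_count: int,
                     feed_size_bytes: int) -> float:
    """Bandwidth used by a client that refetches every feed in full on each poll."""
    polls_per_minute = 60.0 / poll_interval_s
    return polls_per_minute * feed_count * feed_size_bytes


def make_etag(body: bytes) -> str:
    """Derive a validator from the feed content (one possible choice of ETag)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return (status, payload): 304 with no body when the client's copy is current."""
    if if_none_match == make_etag(body):
        return 304, b""
    return 200, body


# The figures from the post: 5-second polling, two feeds, about 20 KB each.
print(bytes_per_minute(5, 2, 20_000))  # 480000.0 bytes, almost half a megabyte
```

A client that remembers the ETag and sends it back gets the empty 304 answer for as long as the feed is unchanged, so even aggressive polling costs only headers instead of 20 KB per feed.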
This, I think, is Colin's major fear with scraping sites like wotzwot: do they re-scrape the target site every time a feed is polled (perhaps even by several people), or do they cache the reformatted feeds locally? When I have some more time, I will give it a try and use it on some MobileRead resource, so that I can check our log files later to see how wotzwot handles this issue. |
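
A log check of that kind could look roughly like the sketch below. It assumes the server writes access logs in the Common Log Format (as Apache does by default); the actual MobileRead log format is not stated in the thread, and `hits_per_minute` is a hypothetical helper. If a scraping service re-fetches on every poll, its host should show up many times per minute for the same path; if it caches, only occasionally.

```python
import re
from collections import Counter
from typing import Iterable

# Common Log Format: host ident authuser [timestamp] "request line" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def hits_per_minute(lines: Iterable[str], path: str) -> Counter:
    """Count requests for `path`, keyed by (client host, minute)."""
    counts: Counter = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or m.group("path") != path:
            continue  # skip unparsable lines and requests for other pages
        # Timestamps look like 10/Oct/2000:13:55:36 -0700; keep up to the minute.
        minute = m.group("time")[:17]
        counts[(m.group("host"), minute)] += 1
    return counts
```

Calling it with an open log file and the path of the scraped page gives a per-minute hit count for each client, which is enough to tell a caching proxy from one that re-scrapes on demand.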
#5 |
Connoisseur
Posts: 62
Karma: 72
Join Date: Oct 2002
Location: Germany
Device: nook
|
At least for now they don't cache. I tried fetching a Google News search through wotzwot.com:
google news
It gave me new links every few minutes. That could change, of course. -Stobs. |
#6 | |
Fully Converged
Posts: 18,171
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
Quote:
|
|