Quote:
Originally Posted by Chaos
Just be sure to use the caching options, as most websites don't like having their RSS feeds hit more than once every 30 minutes (Slashdot, in particular, is very strict about this.)
Ironic you should mention that... I'm writing a paper titled "Why RSS Sucks", aimed specifically at the 99% of users and feed readers that ignore the entire point of RSS: to syndicate news. The paper goes through the various types of RSS (there are 7 formats), how they're delivered (details on each header, etc.), and how 99% of the readers out there completely misbehave when revisiting feeds.
We are seeing roughly 800 separate readers and agents hitting our RSS feeds every hour, even though the content on those feeds hasn't changed in weeks. Not only does the "Expires:" header specify a 2-week revisit interval, but the ETag for some of the stale news articles hasn't changed in months.
And yet these readers still pound the feed almost every hour. I've just started blocking them en masse. If they can't learn to use the standards, they can use them on someone else's pipe.
Code:
# iptables-save | grep "dport 80" | wc -l
622
(Some of these are entire /24 CIDRs)
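
For anyone writing a reader and wondering what "using the standards" looks like in practice, here is a rough sketch of the revisit logic: don't touch the network until the cached "Expires:" time has passed, and when you do go back, send the ETag so the server can answer 304 Not Modified instead of resending the feed. The function names and the shape of the cache dict are my own invention for illustration, not taken from any particular reader, and this skips Cache-Control and error backoff entirely.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def should_refetch(expires_header, now=None):
    """Return True only once the cached copy's Expires time has passed."""
    if not expires_header:
        return True  # no expiry known, so we have to ask the server
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return True  # unparseable date: treat the copy as already expired
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now >= expires


def conditional_headers(etag=None, last_modified=None):
    """Build the validators that let the server reply 304 Not Modified."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, cache):
    """Fetch url politely. `cache` is a hypothetical dict with the keys
    body, etag, last_modified and expires (empty dict on first fetch)."""
    if cache and not should_refetch(cache.get("expires")):
        return cache  # still fresh: no request at all
    request = urllib.request.Request(
        url,
        headers=conditional_headers(cache.get("etag"), cache.get("last_modified")),
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return {
                "body": response.read(),
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "expires": response.headers.get("Expires"),
            }
    except urllib.error.HTTPError as err:
        if err.code != 304:
            raise
        # Not modified: keep the old body, pick up any new expiry time.
        refreshed = dict(cache)
        if err.headers.get("Expires"):
            refreshed["expires"] = err.headers.get("Expires")
        return refreshed
```

With a 2-week "Expires:" like ours, a reader doing this would make one tiny request every two weeks instead of a full download every hour.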