RSS, Tiger Safari and the Bandwidth Bottleneck

In less than 48 hours many of us will be installing OS X Tiger, and with it a brand new Safari browser that can read and display RSS feeds in a simple, easy-to-understand manner. That upgrade, while great for consumers, could come as a big shock for those sites whose feeds are included in Safari’s default starter package. In fact, it could be the biggest stress test for RSS thus far!

Most RSS readers are set to poll for updates every hour. Imagine half a million Tiger Safari users, none of whom have changed the default settings, hitting a server at the same time to pull down RSS updates. Server meltdown? Or an unintended denial of service? Apple says that most of the default feeds are going to be major news sites like CNN, the New York Times, and the LA Times. At this time they are not including any personal blogs as part of the default list. Even for the big sites it is not going to be easy.

Let’s say one of these news operations updates its site once an hour and each update results in a nominal 5 kilobytes of RSS data. Then 500,000 simultaneous Safari users polling at the top of the hour would mean a total data transfer of about 2.5 gigabytes per hour. Times 24, and you have about 60 gigabytes of data transfer every day, just from Safari users. What if more than a million Tiger Safaris were on the loose? Oh boy! An additional 60 gigabytes of traffic a day, or 1.8 terabytes a month, is not that much for large sites, but it will add up.
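For anyone who wants to plug in their own numbers, here is a rough sketch of that back-of-the-envelope math. It assumes decimal units (1 GB = 1,000,000 KB) and a 30-day month, and the function name is just something made up for illustration. It starts from the unrounded 2.5 gigabytes per hour.

```python
def feed_traffic_gb(users, feed_kb, polls_per_day=24, days=1):
    """Total transfer in decimal gigabytes (assumes 1 GB = 1,000,000 KB)."""
    return users * feed_kb * polls_per_day * days / 1_000_000


# Figures from the scenario: 500,000 users, 5 KB per poll, one poll per hour.
per_hour = feed_traffic_gb(500_000, 5, polls_per_day=1)   # 2.5 GB
per_day = feed_traffic_gb(500_000, 5)                     # 60.0 GB
per_month = feed_traffic_gb(500_000, 5, days=30)          # 1800.0 GB, i.e. 1.8 TB

print(per_hour, per_day, per_month)
```

Doubling the user count to a million simply doubles every figure.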

Admittedly, since I don’t have Tiger yet, I am not sure whether Safari RSS checks on a fixed schedule (say, every hour at :15) or at times tied to when the computer or browser was started, which is relatively random and is what other feed readers do. Clearly this is an imaginary scenario, but it could happen.

So what’s the fix? “I certainly hope that Safari does conditional GET. I can’t imagine it doesn’t but I could be wrong,” says Brent Simmons, the man behind the hit feed reader NetNewsWire. “With conditional GET you download the feed only if it’s different from the last time you downloaded it; this cuts way down on bandwidth use.” (More on conditional GET.) “Conditional GET, which NetNewsWire and most other aggregators support, is hugely important,” says Simmons. But even that can only go so far, since most of these news operations churn out headlines with monotonous regularity, which means the feed really has changed on most polls.
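To make conditional GET a little more concrete, here is a minimal sketch of what a feed reader might do. I have no idea how Safari or NetNewsWire actually implement it; the function names are invented for illustration. The client remembers the ETag and Last-Modified values the server sent last time and hands them back in If-None-Match and If-Modified-Since headers. If nothing has changed, the server answers 304 Not Modified with no body.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request carrying the validators saved from the last fetch."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_feed(url, etag=None, last_modified=None,
               opener=urllib.request.urlopen):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = build_conditional_request(url, etag, last_modified)
    try:
        with opener(request) as response:
            # 200 OK: the feed changed, so keep the new validators for next time.
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # 304 Not Modified: nothing was downloaded, reuse the old validators.
            return None, etag, last_modified
        raise
```

The `opener` parameter is only there so the function can be exercised without touching the network. A reader would call `fetch_feed` on every poll and re-parse the feed only when the returned body is not `None`.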

Long term, I think RSS is going to become a clear bandwidth hog unless the RSS people come up with an intelligent way to fix this problem. I have been tracking my own bandwidth consumption, and RSS is sucking up gigabytes like a parched man on a hot summer day. Some say that randomizing the whole RSS polling process is the answer.

How about randomizing the whole RSS polling process? Instead of every reader pulling down feeds at the top of the hour, let the downloads happen at random times. Okay, that will help distribute the load on the servers more evenly, but it still doesn’t resolve the issue of inefficient use of network resources, especially for those who pay for those kinds of things. Suggestions?
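Here is a toy illustration of what randomizing buys you, under the assumption that each client simply picks a random offset within the hour for its polls. The names are hypothetical and this is not how any particular reader works. It compares the busiest minute when everyone polls on the hour against the busiest minute with random offsets.

```python
import random
from collections import Counter


def first_poll_offset(interval_seconds=3600, rng=random):
    """Pick a random offset (in seconds) within the polling interval."""
    return rng.uniform(0, interval_seconds)


def peak_per_minute(offsets):
    """Count polls per one-minute bucket and return the busiest bucket."""
    buckets = Counter(int(offset // 60) for offset in offsets)
    return max(buckets.values())


clients = 60_000
rng = random.Random(42)  # fixed seed so the run is repeatable

on_the_hour = [0.0] * clients
randomized = [first_poll_offset(3600, rng) for _ in range(clients)]

print(peak_per_minute(on_the_hour))  # all 60,000 land in the same minute
print(peak_per_minute(randomized))   # roughly clients / 60 per minute
```

The peak drops sharply, but the total number of requests and bytes per hour is exactly the same, which is the point made above.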

Scott Rafer, CEO of Feedster, says, “ISPs can start caching feed URLs but if they do it with cached times of more than 10 min, then people will route around the caches.”