Skype Tells Us What Happened

Skype’s Heartbeat Blog has an explanation for the 30-hour outage that plagued the eBay-owned (EBAY) voice company last week. A quick overview:

1. Microsoft issued Windows updates on Thursday, Aug. 16th.
2. Millions installed those patches, rebooted, and tried to log into the Skype network — pretty much all at the same time.
3. Combined with a lack of P2P resources, the flood of log-in requests put the Skype network under extreme stress.
4. This, in turn, exposed an unseen software bug “within the network resource allocation algorithm which prevented the self-healing function from working quickly.”
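
To get a feel for why steps 2 and 3 matter, here is a rough back-of-the-envelope sketch. It is not based on anything Skype has published: the client count and the time windows are made-up numbers, and `peak_logins_per_minute` is a hypothetical helper. It only shows how the busiest minute grows when the same number of log-ins is squeezed into a short post-reboot window instead of being spread over a day.

```python
import random
from collections import Counter


def peak_logins_per_minute(num_clients, window_minutes, seed=0):
    """Return the log-in count of the busiest minute when every client
    logs in at a uniformly random minute inside the window."""
    rng = random.Random(seed)
    per_minute = Counter(rng.randrange(window_minutes) for _ in range(num_clients))
    return max(per_minute.values())


# Hypothetical figures, chosen only for illustration.
CLIENTS = 100_000

# Ordinary day: log-ins spread across 24 hours.
normal_peak = peak_logins_per_minute(CLIENTS, window_minutes=24 * 60)

# After a mass reboot: the same clients return within half an hour.
post_patch_peak = peak_logins_per_minute(CLIENTS, window_minutes=30)

print("busiest minute, ordinary day:", normal_peak)
print("busiest minute, after mass reboot:", post_patch_peak)
```

The real network is far more complicated (supernodes, retries, clients that never rebooted), so treat this as an illustration of the log-in spike only, not of the bug in step 4.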

OK, it sounds credible, but do you buy it? Skype Journal has some questions, namely whether the fix for the bug has been propagated. What, they ask, is preventing this from happening again? After all, Microsoft (MSFT) routinely issues patches. Brough Turner, chief technology officer of NMS Communications, alludes to this in his most recent post.

Experts have pointed out that Skype generates a lot of traffic between log-in servers and supernodes. Maybe the supernodes went down during the “patches” as well. Someone who seems to be familiar with the Skype network architecture left a comment earlier that explains the relationship between the 50-odd authentication servers and the supernodes, and also points to a weak link. (Full explanation is here.)