As the firehose matures, Twitter tightens grip on valuable asset

This summer, Twitter showed that it had no problem playing favorites when it came to developers using its API to build a business. So when it comes to the company’s firehose, the unfiltered deluge of 1 billion tweets every two and a half days, it’s becoming apparent that the company will continue to play favorites there as well, limiting who has access to one of its most valuable assets.

Twitter is currently being sued by PeopleBrowsr, a company that has had full access to Twitter’s firehose, after Twitter said it would restrict that access following the expiration of a previous contract. From Twitter’s perspective, it makes sense to consolidate who accesses and re-distributes that data, given the scope of the data and the work involved in transmitting those tweets. But it hasn’t always been this way, and PeopleBrowsr is challenging Twitter over its earlier commitment to openness when it comes to that firehose of data.

In 2010, Twitter licensed its full firehose both to big guys like Bing, Microsoft, and Google and to several startups focused on realtime search, with then-Twitter spokesman Sean Garrett explaining that the company wanted to build partnerships that were “sustainable and scalable.”

With access to the full Firehose of data, it is possible to move far beyond the Twitter experiences we know today. In fact, we’re pretty sure that some amazing innovation is possible. Today, we’re happily turning the Firehose on for some new partners focused mainly on exploring the incredibly rich field of real-time search and discovery. We are thrilled to announce that Ellerdale, Collecta, Kosmix, Scoopler, twazzup, CrowdEye, and Chainn Search join us as partners. These companies range from funded startups to part-time, one-person operations so we came up with a fair way to license access that scales with their business. If you think there may be a potential partnership involving access to the Firehose, let’s start a conversation.

Clearly, that conversation about access to the firehose has changed since 2010. A highly public partnership with Google broke down, killing Google’s realtime search product that relied on the firehose of data. Twitter and Bing re-negotiated their terms in 2010, with AllThingsD reporting at the time that Twitter was asking about $30 million for the access. And most of the startups that received access in 2010 are no longer powering search engines: Ellerdale was acquired by Flipboard in July 2010, Collecta has removed its “customer-facing site,” Kosmix was acquired by WalMart in April 2011, Scoopler shut down after the team was acquired by Google to work on Google+ in 2011, and CrowdEye shut down after the founders said they were unable to build a profitable business.

So who still has access and distribution rights to the firehose? The full list of companies with commercial partnerships isn’t publicly available, but a few, like Bing and Salesforce, have publicized the relationship, and several companies listed as part of Twitter’s certified program have firehose access and can provide enterprise access to that data for other interested companies. Duncan Greatwood, CEO of Topsy, one of the companies licensed to re-distribute the data, said that including Topsy’s archives, there are now more than 250 billion tweets. That makes it unrealistic, in his opinion, that many companies would be prepared to access the full stream; most companies access only smaller percentages.

But PeopleBrowsr CEO Jodee Rich said his company relies on full access to the firehose to serve its customers, and an agreement with one of the other providers wouldn’t make sense.

“The nature of those agreements are so short-term that no one could possibly build a viable business on those agreements,” he said in an interview. In Rich’s legal statement, which is available online, he notes that PeopleBrowsr was paying Twitter more than $1 million per year for firehose access.

Topsy, Gnip and DataSift are three providers that license the full firehose of data from Twitter and then re-sell portions of it either to developers building products with the data (showing your influence on Twitter, for instance) or to marketers and brands using sections of tweets to track conversations and trends on the network. PeopleBrowsr argues in the court documents that being forced to get data from one of these companies would compromise its ability to serve its clients, since it previously had full access. Twitter counters that it’s totally within its rights to change the terms, given that its initial contract with PeopleBrowsr has run its course:

But as Twitter has grown, its contracting practices have matured. Where it once contracted directly with just a handful of data customers like PeopleBrowsr, it now has hundreds of data customers. In order to handle that broad commercial demand in a consistent and transparent manner Twitter has created channel resyndication partnerships with Gnip, DataSift, and Topsy. PeopleBrowsr is free to contract with any of them, just as its competitors do. What it is not free to do is insist that Twitter preserve forever its earlier business model, or continue to be bound by a contract that expired more than a year ago.

A Twitter spokesperson replied, “We believe the case is without merit and will vigorously defend against it.”

PeopleBrowsr has won a restraining order that forces Twitter to continue firehose access for the time being. As we’ve learned this year, building your company on top of another company’s data is inherently risky. But the lawsuit does demonstrate the value Twitter’s data has grown to hold, and the lengths companies will go to in order to secure it.