Big Apple – the new Big Data Central?

When it comes to New York’s booming tech startup scene, digital media and new commerce companies get all the attention, but in reality, the Big Apple’s big opportunity might well be as the hub for “data-centric” businesses. In taking advantage of this opportunity, the city wouldn’t be straying far from its traditional strengths.

I was reminded of New York’s role in the emerging big data ecosystem over the weekend, when I was catching up with my weekly reading.

  • eBay (s EBAY), which acquired Hunch in November 2011, is now building out its operations in New York with a new 35,000 square-foot office, Ryan Kim reported last week. “When fully built out, a majority of the workers will be developers, data scientists and statisticians,” he added.
  • Microsoft (s MSFT) has hired a bunch of former Yahoo (s YHOO) researchers in New York.

Steve Lohr of The New York Times wrote about Microsoft:

The group’s research focuses in large part on the application of advanced computing tools to the social sciences. It is a fast-growing field fueled by the vast new data sources of the Web, social-network communications and from sensor-equipped devices like smartphones. The potential is enormous, as Google and Facebook prove. But Microsoft has trailed so far.

Beyond these two recent announcements, there are several startups that are experimenting with data. Take the URL shortener Bitly, for example. The service, which is based in Manhattan, is creating new news-reader experiences based on the data it collects and the social context that data carries. Businessweek writes:

This year, Bitly is introducing a suite of data products for professionals developed in part by Mason and her team of six scientists and engineers. One, dubbed Bitly Realtime, tracks terms that receive sudden bursts of attention. Another is a reputation-monitoring system. The goal of the products is “to give people a Spidey sense about what’s going on on the Internet that’s relevant to them,” says Mason.

Hilary Mason, who is the chief scientist at Bitly, told me that in “New York we are more interested in telling stories” from data, as opposed to how “big” the data is or what database technology you happen to use. “This is how businesses, marketers, and social scientists need to think about data to make rational decisions.”

New York has a long history of learning from data, thanks to the quantitative revolution that swept Wall Street. Financial services was the first real big data vertical, and quants were Wall Street’s data scientists. The markets for tradable instruments and the high-volume, high-velocity data streams all came from Wall Street. (Actually, that was the rationale for hosting our Structure: Data conference in New York.)

Mason believes that New York can leverage big data to its advantage. From art to fashion to media, New York has enough creative talent to ask the right questions of the data. A good example is the Explore feature on Foursquare, which co-founder Dennis Crowley calls the “big data driven recommendation engine for the real world.” (Here is a presentation of the technology behind Explore that is pretty cool.)

Other startups are helping make sense of social data as well. For instance, one pulls in your social data across platforms and lets you ask questions like “Who do I know who likes cheeseburgers in Paris?”

“There are a bunch of interesting things happening in NYC, not all of which are startups,” Mason says, singling out Wes McKinney’s Pandas project, which is a time-series analysis system drawn from his experience in finance. Mason calls it “awesome.”