NoSQL is growing up, and DataStax just raised $106M to prove it

“For me personally, no, I didn’t see this coming,” DataStax Co-founder and CTO Jonathan Ellis admitted during a recent phone interview about the $106 million series E round of venture capital the company announced on Thursday.

“I felt that we were at the right place at the right time in terms of distributed databases being a problem that the industry needed to solve, but I don’t know that I saw quite this level of success,” he continued. “My wife was telling me after one of my recent trips, ‘When you started this, I thought I was signing up for four years of you traveling and being gone, and now you’re raising a series E.’”

But DataStax, which was founded in 2010 and sells a commercial version of the open source Apache Cassandra distributed database, is officially a big company. Its latest round of financing brings the company’s total equity investment to $190 million and values the company at $830 million. Kleiner Perkins Caufield & Byers led the round, which also included new investors ClearBridge, Cross Creek, Wasatch, PremjiInvest and Comcast Ventures. DataStax’s existing investors all participated as well and, as co-founder and Chief Customer Officer Matt Pfeil noted, Series A investor Lightspeed Venture Partners “really went above and beyond” what was required.

They’re all excited by DataStax’s impressive growth: an employee headcount that has doubled to 350 in 2014 and will hit 450 by the year’s end; a 500-company-plus customer base that includes 25 percent of the Fortune 100; and offices and engineers spread across the United States, Europe (where a large number of its Cassandra developers are based) and Asia. New investor PremjiInvest came on board to help the company expand into India.

Not bad for a company that grew out of a kumbaya moment about six years ago, when open source non-relational databases began popping out of the woodwork, all collectively unifying under the NoSQL banner. Cassandra itself came out of Facebook, which developed it for some specific applications but eventually began moving future services to a similar open source technology called HBase. For a while, there was a collective rallying cry that relational databases couldn’t scale and NoSQL — pick a project, any project — was the answer.

For a look back to the good, old days, check out Pfeil pitching Riptano (as DataStax was originally called) at our Structure 2010 launchpad competition.


Or watch this longer video interview with Pfeil and Gigaom’s Stacey Higginbotham, also from 2010. Or check out Ellis talking about Cassandra as part of the NoSQL Tapes series, a collection of interviews with NoSQL entrepreneurs and developers from 2010.

After those early days came the startups, the money, and the realities of selling into companies with IT budgets and legacy infrastructure rather than winning over web developers with open source code. Some NoSQL projects and companies fizzled out; others survived with varying levels of success. NoSQL is now a proven set of technologies (often as complements to existing relational systems rather than replacements for them), and new NoSQL technologies, projects and companies continue to emerge.

But the big three winners of that first batch are MongoDB, Couchbase and DataStax. MongoDB is by far the biggest in terms of name recognition, investment and sheer user count, but they’ve all raised more than $100 million and have secured some big-name customers. They’ve also all had to change their mindsets in order to grow like this, in terms of engineering for a different customer base and in sharpening their elbows for competition.

“If your dataset sits on one machine, it almost doesn’t matter what technology you use,” Ellis said.

Some of the other projects featured in the NoSQL Tapes.


At the high end of the market, though, where customers have big data and big budgets, technology does matter. But building a technology that can hold up at that scale in customers’ data centers without requiring them to employ a team of Facebook- or Google-level engineers means making some sacrifices. For DataStax, that meant, somewhat reluctantly and not always naturally, saying goodbye to its hacker roots and focusing on how to get its technology into enterprises.

Although that’s not to say DataStax has moved entirely away from the early adopter crowd. The company runs its own startup program that Ellis said includes more than 300 participants in Europe alone. And it’s still making some engineering bets that lean futuristic today and will take some time to make their way into the mainstream — such as the addition of support for Apache Spark, to improve the speed of analytic jobs, in the latest version of the DataStax Enterprise software.

“We’re kind of straddling two worlds right now,” Ellis explained. “We had our roots in kind of the Silicon Valley, early-adopter crowd, and we’re expanding rapidly in the more traditional enterprise market. I think the former of those is the one that’s like, ‘Oh, Spark is the greatest thing since sliced bread,’ and it hasn’t really registered quite yet with the less tech-centric customers.”

The latest DataStax marketecture diagram.


Despite its focus on being the biggest, fastest, most-scalable database around, though, the DataStax team knows that it can’t sacrifice usability entirely in the name of performance and scale. Ellis didn’t come right out and say it, but he did suggest that unlike web startups or other engineering-centric companies that will run numerous databases for numerous different tasks, enterprise customers are often looking to settle on one relational database system (such as Oracle or MySQL) and one NoSQL system. That might mean there’s not room for DataStax, Couchbase and MongoDB within the same company.

“On the one hand, they’re finding that Oracle is a really terrible way to scale to millions of users, but on the other hand they don’t want to go full-on polyglot persistence because that’s a maintenance madhouse,” he said. “So really what they want to do is standardize on a relatively small toolset that they can use over and over. … Often that’s Oracle and Cassandra or MySQL and Cassandra. … Those two are general enough that you … might not necessarily need to add Redis into the mix, for instance.”

Competing primarily against, and often sitting side by side with, a behemoth like Oracle seems a long way off from when Ellis and Pfeil left Rackspace to start DataStax four years ago, Riptano moniker and rhinoceros logo in tow. Not that they’re complaining about the company’s good fortune.

“The hacker crowd using Node.js, we might not be the best fit for them,” Ellis said, “but I’m OK making that tradeoff in exchange for being a better tool for the Fortune 500 and the Fortune 1000.”