Cloudera acquires big data encryption specialist Gazzang

Hadoop software company Cloudera has acquired Gazzang, a startup specializing in encryption software for big data environments. It’s Cloudera’s first significant acquisition (it bought machine learning startup Myrrix in 2013 in more of an “acqui-hire” situation) and it speaks to the importance of security as customers’ Hadoop deployments grow in scale and mature into production environments. The deal comes less than a month after Cloudera competitor Hortonworks acquired a security startup called XA Secure.

Gazzang’s technology includes a product for encrypting data stored in Hadoop environments and another for managing who can access the keys, tokens and other protocols that grant access to the data. “We will do the encryption, you know, scramble the data, and then we allow you to access controls,” David Tishgart, the company’s director of marketing and alliances, explained.

More interesting, however, is that Gazzang’s technology also works with a variety of other next-generation data stores, including Cassandra (Apache and DataStax), MongoDB, Couchbase, Amazon Elastic MapReduce and Pivotal’s Hadoop distribution. As service-oriented architectures pick up steam, more applications will be attempting to access the same data stores, and as Cloudera continues to position itself as an “enterprise data hub,” it expects to be at the core of those environments, Cloudera senior director of product marketing Clarke Patterson explained.

“[Our security story] has got to be such that…it’s got the measurements built in and the mechanisms to protect against wrongful access,” he said.

Cloudera already has numerous security measures in place within its software. Some, such as Kerberos for managing who can access a Hadoop cluster, are baked into the various open source technologies that comprise its Hadoop distribution. Cloudera has spearheaded others itself, such as the Apache Sentry project for managing who (or what) can access data and metadata stored within Hive and Impala environments.

The Gazzang architecture.

The Gazzang technology will also help advance a chip-level encryption initiative, called Project Rhino, that Cloudera was working on as a result of its tight partnership with Intel. The Gazzang headquarters in Austin, Texas, will become a research hub called the Cloudera Center for Security Excellence.

This being the Hadoop industry, though, no acquisition can be completed without addressing the open source elephant in the room. Apache Hadoop is and always has been an open source project, although the various Hadoop vendors’ strategies around intellectual property vary as they develop new technologies to work alongside it. Hortonworks, for example, said it plans to open source the entirety of the code it inherited as part of its XA Secure acquisition.

Initially, the Gazzang products will be incorporated into the Cloudera Navigator suite for Enterprise Data Hub customers, and will be available to license separately for users of Cloudera’s free Hadoop software. However, Patterson noted, Cloudera is committed to not locking its customers into its technologies, and will consider the Gazzang products in this light as it integrates them more fully.

Tishgart noted that one of the first technologies to emerge from the new security research team will be a high-performance encryption engine for the Hadoop Distributed File System that Gazzang was already working on, and he expects it to be open sourced by the end of the year.

Details of the acquisition were undisclosed. According to Crunchbase, Gazzang had raised $9.6 million in venture capital since it was founded in 2010.

Feature image courtesy of Shutterstock user Maksim Kabakou.