Gigaom Research webinar: Apache Hadoop: Is one cluster enough?

When we’re talking about conventional IT systems, we rarely question the idea of geo-distributed systems and redundancy. And we don’t usually challenge the notion that load balancing among servers and farms is a smart thing to do. So why don’t we routinely think this way about Hadoop?

Customers can set up multiple Hadoop clusters and use each one for a different workload. Companies can then site these clusters in different geographies for redundancy, load balancing and/or content distribution. The data can be segregated or, using replication technology, it can be synchronized between sites to create a “logical data lake.” Is using multiple Hadoop clusters in this way folly, or is it just pragmatism?

In this webinar, our panel will discuss:

  • Does Apache YARN make all tasks equal or does dedicating clusters to specific workloads make more sense?
  • Is the data lake concept best for all, or is partitioning data between clusters right for some customers?
  • Can Hadoop inter-cluster replication of data work?
  • How do public and private cloud architectures impact the multi-cluster question?
  • Can multiple clusters be a vector of parallelism and elasticity?

Speakers include:

  • David S. Linthicum, SVP, Cloud Technology Partners
  • Paul Miller, Founder, The Cloud of Data
  • Lynn Langit, Founder & Consultant, Lynn Lancet
  • Randy DeFauw, Senior Product Manager, WANdisco

Register here to join Gigaom Research and our sponsor WANdisco for “Apache Hadoop: Is one cluster enough?” a free analyst webinar on Wednesday, October 15, 2014 at 10 a.m. PT.