Welcome to the bespoke data center.
In an effort to streamline operations and cut costs, enterprise data centers are taking a page from the web-scale companies and building out facilities customized in small chunks allocated to different workloads. This infrastructure is called “core and pod,” and it’s growing in popularity.
In this type of architecture, data center operators figure out the best configuration of common rack gear — servers, storage, networking equipment and so on — to suit the needs of each of their applications. These custom configurations are referred to as pods (some people call them clusters), and they are connected to the network core that distributes data and network traffic to customers.
By breaking the entirety of a data center down into groups of pods, it’s easier for companies to understand how all their gear fits together as a whole. Companies also get the benefit of being able to simply build out more pods if they find they need to expand their infrastructure.
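The appeal is largely arithmetic: if every pod is an identical building block, capacity planning reduces to counting pods. A minimal sketch (the per-pod figures below are invented for illustration, not taken from any vendor):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pod:
    """One repeatable unit of rack gear (hypothetical figures)."""
    servers: int = 40
    storage_tb: int = 200

def datacenter_capacity(pods):
    """Total capacity is just the sum over identical pods."""
    return {
        "servers": sum(p.servers for p in pods),
        "storage_tb": sum(p.storage_tb for p in pods),
    }

core = [Pod() for _ in range(3)]              # three pods on the floor
print(datacenter_capacity(core))              # {'servers': 120, 'storage_tb': 600}

core.append(Pod())                            # need to expand? drop in one more pod
print(datacenter_capacity(core)["servers"])   # 160
```

Because every unit is interchangeable, “how much do we have?” and “how do we grow?” both reduce to the pod count, which is the operational simplicity the model trades on.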
How pod and core configurations boost performance
[company]Google[/company], [company]Facebook[/company] and [company]eBay[/company] are all examples of major tech companies that have been using the pod and core setup, said Dave Ohara, a Gigaom Research analyst and founder of GreenM3. With massive data centers that need to serve millions of users on a daily basis, it’s important for these companies to have the ability to easily scale their data centers with user demand.
Using software connected to the customized hardware, companies can now program their gear to take on specific tasks that the gear was not originally manufactured for, like analyzing data with Hadoop, ensuring that resources are optimized for the job at hand and not wasted, Ohara said.
It used to be that different departments within a company — such as the business unit or the web services unit — directed the purchases of rack gear, as opposed to a centralized data center team that could manage the entirety of a company’s infrastructure.
Because each department may have had differing requirements, the data center ended up far more bloated than it should have been, resulting in what Ohara referred to as “stranded compute and storage all over the place.”
“You need to have some group that is looking at the bigger picture,” Ohara said. “A group that looks at what is acquired and makes it work all together.”
Fastly’s take on software-defined networking using pods
Consider the content delivery network (CDN) startup [company]Fastly[/company]. For Fastly to be able to stream content in the quickest possible way while also ensuring that new servers come online to replace failed ones, the company had to design its own custom architecture, which spans 17 point-of-presence (POP) locations — the places where networks come together — across the globe.
Fastly’s pod setup consists of two racks of servers and switches, the exact number of which is determined by the POP location and how much traffic it handles, said Tom Daly, Fastly’s vice president of infrastructure. If the company needs to add more resources to a particular POP or a major anchor site that receives a ton of traffic, like Fastly’s San Jose, California, data center, it can simply drop in another two-rack pod and connect it to the rest of the system.
Because of improvements in technology over the past ten years, Fastly doesn’t need a pile of excess hardware, like dedicated load balancers or routers, to distribute information quickly and reliably throughout the network, explained Fastly VP of technology Hooman Beheshti.
He remembers a time when companies were at the mercy of whatever software came pre-stocked on hardware devices like a load balancer appliance. With the advent of software-defined networking techniques, however, Fastly can now customize the software running inside its [company]Arista[/company] switches and reprogram them to take on more tasks than what they were designed for. Now, Fastly’s switches and servers can do load balancing and can intelligently allocate traffic across the network.
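Fastly hasn’t published the code running on its switches, but the general technique — deterministically spreading flows across a set of servers without a dedicated load-balancer appliance — can be sketched with a simple hash-based picker (the server names here are hypothetical):

```python
import hashlib

SERVERS = ["cache-01", "cache-02", "cache-03", "cache-04"]  # hypothetical hosts

def pick_server(flow_key: str) -> str:
    """Map a flow (e.g. client IP and port) to a server by hashing,
    so traffic spreads across machines and the same flow always lands
    on the same server -- no stateful load balancer required."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]

# The same flow is always routed consistently.
assert pick_server("203.0.113.7:443") == pick_server("203.0.113.7:443")
```

Real switch silicon does this kind of hashing (as in ECMP) in hardware and differs in detail, but deterministic per-flow hashing is the core idea behind balancing traffic in the switch itself.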
“We took these machines, simplified everything, and wrote code and focused them toward our needs specifically,” said Beheshti.
Working with companies to build their data centers
At [company]Redapt[/company], a company that helps organizations configure and build out their data centers, the emergence of the pod and core setup has come out of the technical challenges companies like those in the telecommunications industry face when having to expand their data centers, said Senior Vice President of Cloud Solutions Jeff Dickey.
By having the basic ingredients of a pod figured out for your organization’s needs, scaling out is just a matter of hooking up more pods, and you aren’t stuck with excess equipment you don’t need, explained Dickey.
Redapt consults with customers on what they are trying to achieve in their data center and handles the legwork involved, such as ordering equipment from hardware vendors and setting up the gear into customized racks. Dickey said that Redapt typically uses an eight-rack-pod configuration to power up application workloads of a similar nature (like multiple data-processing tasks).
With this setup, it’s easy to keep track of the various pod configurations a company uses. If a company discovers it needs more compute for better performance, it can buy another pod tailored to heavy processing rather than one better suited to storage-related tasks.
“I’m trying to say, ‘Hey this is what works; let’s not spend the next eight months customizing Frankencloud,’” said Dickey.
How Egnyte breaks down its data centers into pods
For file-sharing startup Egnyte’s three data centers, the company uses four pods that each contain four racks that handle compute, storage and networking, said Kris Lahiri, Egnyte’s VP of operations and chief security officer. The startup also has an additional two pods in each data center with which it can try out more advanced features or do quality assurance checks and testing.
[company]Egnyte[/company] runs an algorithm to distribute its customers’ data across the pods and has a dashboard that logs each pod’s performance, so it can tell whether one is spiking or suffering an outage. This is important for the startup because its clients can drop terabytes of data into Egnyte’s system without notice; with all the pods hooked together and monitored, Egnyte can react to sudden changes.
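Egnyte’s actual placement algorithm and dashboard aren’t public; as a rough stand-in, the two ideas described here — deterministic customer-to-pod placement and per-pod spike detection — might look like this:

```python
import hashlib
from statistics import mean

PODS = ["pod-1", "pod-2", "pod-3", "pod-4"]  # four pods, as in Egnyte's setup

def pod_for_customer(customer_id: str) -> str:
    """Deterministically place a customer's data on a pod (a stand-in
    for Egnyte's undisclosed distribution algorithm)."""
    h = int(hashlib.md5(customer_id.encode()).hexdigest(), 16)
    return PODS[h % len(PODS)]

def is_spiking(load_samples, threshold=2.0):
    """Flag a pod whose latest load is well above its recent average,
    e.g. when a customer drops in terabytes without notice."""
    baseline = mean(load_samples[:-1])
    return load_samples[-1] > threshold * baseline

assert pod_for_customer("acme-corp") == pod_for_customer("acme-corp")
print(is_spiking([10, 12, 11, 40]))  # True: sudden jump over the baseline
```

Hashing gives stable placement without a lookup table, and the per-pod breakdown is what lets a dashboard localize a spike to one pod instead of the whole data center.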
“Instead of looking at the entire gamut of data centers, you break it down by a data center and then you break it down by a pod,” said Lahiri.
Egnyte configured the compute section of each of its pods to take on heavy processing tasks, like encryption, indexing with Elasticsearch and setting up user permissions and access details. While setting up basic administrative controls for user data may seem like a relatively easy task, doing so actually requires a lot of processing power when one wants to apply the same internal IT configurations to multiple clouds, explained Lahiri.
In this case, Egnyte was able to tie in the open source Puppet (see disclosure) configuration-management tool to its data center so that the same administration permissions a company wants to carry over from its internal IT systems will be carried over to the Egnyte cloud.
Building your own Amazon cloud
Beyond improving Egnyte’s data center efficiency, Lahiri said the new way of building data centers has prevented vendor lock-in. While the startup currently uses a mix of Dell servers and Supermicro hardware for storage, if one day it decides to use different gear, it’s going to be a lot easier to swap out appliances than in the past, because Egnyte can just reprogram the gear and have the IT management software handle everything in the same way.
Fastly’s Daly shared a similar sentiment, saying that the ability to buy custom hardware, modify it with software and then place it into pods is a much better way to run a data center, because you are no longer forced to use the same gear in exactly the same way as a company in a different industry might.
Of course, building your own pod and core setup isn’t cheap. According to Dickey, Redapt can charge anywhere from $200,000 to $600,000 for a fully loaded rack; multiply that by the typical eight-rack pod and you’re looking at a bill in the millions of dollars.
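Spelling out that arithmetic, using the quoted per-rack range and eight racks per pod:

```python
# Quoted range: $200K to $600K per fully loaded rack, eight racks per pod.
rack_low, rack_high = 200_000, 600_000
racks_per_pod = 8

pod_low = rack_low * racks_per_pod
pod_high = rack_high * racks_per_pod
print(f"${pod_low:,} to ${pod_high:,} per pod")  # $1,600,000 to $4,800,000 per pod
```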
But for companies that want full control of how their infrastructure functions, using a pod and core setup powered by software is worth the effort.
“This is essentially like doing an Amazon cloud yourself,” said Lahiri. “Amazon is, of course, doing it at a much larger scale, but technically you are doing the same thing.”
Disclosure: Puppet Labs is backed by True Ventures, a venture capital firm that is an investor in the parent company of this blog, Gigaom.
Post and thumbnail images courtesy of Shutterstock user watcharakun.