At one time, we thought single-cloud deployments would be the standard. Today, complex hybrid or multi-cloud deployments are more prevalent than single clouds. Indeed, there could be as many as a dozen clouds under management. Those charged with CloudOps need to learn how to place a layer of technology between themselves and complex cloud services, and learn fast.
CloudOps is gaining importance as enterprises become more dependent on public and private clouds. However, few best practices or best-of-breed tools have emerged in this arena. Most of those charged with operating clouds have felt their way in the dark these last few years. That needs to change.
Continuous operations means running 24 hours a day, 7 days a week: operations never stop. You put hardware and software solutions in place to eliminate planned maintenance windows, and existing applications continue to serve customers until you have deployed and tested the new versions and switched traffic over to them.
If you think this requires new approaches to platforms, you’re correct. Organizations that want to pursue continuous operations should expect to invest in many dual, redundant platforms. While public cloud providers offer a clear cost advantage in provisioning servers to support continuous operations, you’ll still pay for more resources than if you deployed on a single platform and took it offline for scheduled maintenance.
That said, the value of continuous operations can be largely determined by the cost of downtime. If core systems are unavailable for a certain period of time, either planned or unplanned, that downtime can cost the company as much as $1 million per hour, depending upon the business. As more companies move to 24-hour global operations, the cost will shift upward, making it easier to cost-justify continuous operations.
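The cost justification above comes down to simple arithmetic, sketched here in Python. The function names and all input figures are hypothetical, apart from the $1 million per hour upper bound quoted above. The sketch also makes the simplifying assumption that continuous operations eliminates the downtime entirely.

```python
def annual_downtime_cost(hours_down_per_year: float, cost_per_hour: float) -> float:
    """Expected yearly loss from downtime, planned or unplanned."""
    return hours_down_per_year * cost_per_hour


def continuous_ops_pays_off(
    hours_down_per_year: float,
    cost_per_hour: float,
    extra_platform_cost_per_year: float,
) -> bool:
    """True when the downtime avoided is worth more than the redundant platforms.

    Simplifying assumption: continuous operations removes all of the downtime.
    """
    avoided = annual_downtime_cost(hours_down_per_year, cost_per_hour)
    return avoided > extra_platform_cost_per_year


# Hypothetical inputs: 8 hours of downtime per year, the $1 million per hour
# upper bound, and an assumed $3 million per year for redundant platforms.
print(annual_downtime_cost(8, 1_000_000))
print(continuous_ops_pays_off(8, 1_000_000, 3_000_000))
```

With a lower hourly cost, the same redundant platforms may not pay for themselves, which is why the business's own downtime figure matters more than any industry average.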
Importance of Cloud Metrics
Core to this goal is the gathering of cloud metrics, such as performance and transaction data, as they occur. This information is valuable for a few reasons:
- You can trend the data and spot issues in recent operations. This would include read/write errors from your private cloud storage, which could be an indication that a drive is about to fail and needs to be replaced.
- You can use the data for predictive analytics. This would include determining when the demands on servers will require you to provision more of them. Operating in line with capacity demands makes you more cost effective with private and public cloud resources.
- You can set up processes that make your systems in the clouds self-healing. Keep in mind that cloud providers offer a few auto-fixes for common problems within the infrastructure. However, for the most part, you’re in charge of the application layer. When you gather operational metrics, you can define policies with thresholds that trigger a corrective action when certain conditions occur. That means you won’t get calls in the middle of the night because applications have stopped due to a database issue or a connectivity problem.
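The three uses of metrics listed above can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a real monitoring tool: the metric names, thresholds, and action names are all hypothetical, and the forecast is a plain least-squares straight-line fit rather than any particular vendor's predictive analytics.

```python
from statistics import mean


def linear_forecast(samples, steps_ahead):
    """Fit a straight line to equally spaced samples and project it forward."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(samples)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n - 1 + steps_ahead)


def evaluate_policy(metrics, policy):
    """Return the actions of every rule whose threshold is exceeded."""
    return [
        rule["action"]
        for rule in policy
        if metrics.get(rule["metric"], 0) > rule["threshold"]
    ]


# Hypothetical policy: each rule maps a metric threshold to a corrective action.
POLICY = [
    {"metric": "disk_read_write_errors", "threshold": 5,
     "action": "schedule_drive_replacement"},
    {"metric": "forecast_cpu_percent", "threshold": 80,
     "action": "provision_server"},
    {"metric": "db_connection_failures", "threshold": 0,
     "action": "restart_database_pool"},
]

cpu_history = [40, 50, 60, 70]  # hypothetical CPU utilization samples, in percent
metrics = {
    "disk_read_write_errors": 7,
    "forecast_cpu_percent": linear_forecast(cpu_history, steps_ahead=2),
    "db_connection_failures": 0,
}
print(evaluate_policy(metrics, POLICY))
```

In practice the returned action names would be dispatched to whatever automation the operations team already runs. Keeping the policy as data rather than code means thresholds can be tuned without redeploying anything.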
It’s time we start thinking about what cloud metrics mean for quality cloud operations. CloudOps exceeds traditional on-premises operational practices in terms of uptime. While the focus will be on dealing with operational metrics, the business will measure the value of clouds by the number of service disruptions. The objectives are to prevent bad things from happening and to promote continuous operations of new cloud-based systems. The effective use of cloud metrics can help you reach those goals.
So, if you take continuous operations, add metrics, and make sure that you create solutions that never stop operating, then you have the foundations of CloudOps. While it is a new buzzword, the value behind it is that we’re thinking through how to create the best quality operations for private and public clouds that meet or exceed operations and performance requirements.
The challenge for CloudOps is that we’ve yet to define best practices and tooling, and having them will make a difference. When it comes to the technology and processes that will achieve the continuous operations objectives for cloud computing, it’s still a work in progress.
This post was written as part of the Dell Insight Partners program, which provides news and analysis about the evolving world of tech. For more on these topics, visit Dell’s thought leadership site Power More. Dell sponsored this article, but the opinions are my own and don’t necessarily represent Dell’s positions or strategies.