How ‘systems thinking’ is making the cloud transparent

Given my current obsession with understanding everything I can about how cloud computing is beginning to look, feel and behave like a variety of other complex adaptive systems, I’ve started paying close attention to the widespread practice (outside of IT, it seems) of systems thinking.
Defined in Wikipedia as “the process of understanding how things influence one another within a whole,” systems thinking represents a modeling, analysis and design discipline that carefully explores “macro” aspects of highly interdependent systems. Systems thinking is heavily utilized in such fields as the social sciences, organizational dynamics, and industrial engineering to evaluate, model, and/or design how systems are composed and how they behave.
Systems thinking is difficult for those who have been educated to always apply reductionist thinking to problem solving. The idea in systems thinking is not to drill down to a root cause or a fundamental principle, but instead to continuously expand your knowledge about the system as a whole.
The problem of cloud boundaries
It’s one of the fascinating questions that faces anyone trying to model a system: What are the system’s boundaries? When everything is so highly interdependent (economies are linked to governments are linked to societies are linked to individual people, etc), how do you know where to start modeling, and where to stop?
In her classic book on systems thinking, the late Donella H. Meadows brilliantly articulated the challenge that “systems thinkers” face when scoping the problem they need to address:

“The lesson of boundaries is hard even for systems thinkers to get. There is no single, legitimate boundary to draw around a system. We have to invent boundaries for clarity and sanity; and boundaries can produce problems when we forget that we’ve artificially created them…
…There are no separate systems. The world is a continuum. Where to draw a boundary around a system depends on the purpose of the discussion—the questions we want to ask.”

She goes on to say:

“The right boundary for thinking about a problem rarely coincides with the boundary of an academic discipline, or with a political boundary. Rivers make handy borders between countries, but the worst possible borders for managing the quantity and quality of the water. Air is worse than water in its insistence on crossing political borders. National boundaries mean nothing when it comes to ozone depletion in the stratosphere, or greenhouse gases in the atmosphere, or ocean dumping.”

This, I think, is a critical observation for people building large-scale cloud computing applications and services that integrate with other applications and services in the cloud. Understanding where the boundaries of source code and data models lie is relatively straightforward, but understanding the boundaries of operations (monitoring, compliance, decision making, liability and so on) in cloud-based applications is not.
The nature of cloud systems boundaries
In fact, I would argue that the boundaries of cloud systems will themselves be highly dynamic, in part because of the comings and goings of technologies and services (not to mention politics, economics, and so on). It is also true that it will take time to discover the full extent of those systems for each application or service you operate, as everything is so interconnected.
This is different from what we experienced with so-called “traditional IT”, where we could typically maintain control of all but a few elements of our application systems, and the applications were generally quite isolated. We strove for stability, and that included stable boundaries. It is clear to me that this is becoming increasingly difficult, if not impossible.
There is also an interesting corollary to the problem of boundaries that must be considered when planning any application or service that might be consumed by outside parties. If you do not necessarily know which third-party services affect your “system”, it stands to reason that you also do not know which external systems or applications your offering affects.
In other words, how do you know which application systems you may ultimately affect if anyone can consume your service at any time? Are you making it easy for them to design “around” you for their own resilience?
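To make the two directions of this question concrete, here is a minimal sketch in Python. It assumes you have somehow managed to record which services call which; every name in it (`my_app`, `payments_api`, `partner_portal` and so on) is hypothetical and purely illustrative. It simply walks a dependency graph outward in both directions from one application.

```python
from collections import defaultdict, deque

# Hypothetical call relationships: (consumer, provider) means
# "consumer depends on provider". These names are illustrative only.
EDGES = [
    ("my_app", "payments_api"),
    ("my_app", "object_store"),
    ("payments_api", "fraud_service"),
    ("partner_portal", "my_app"),
    ("mobile_client", "partner_portal"),
]

def build_graph(edges):
    """Index the edges in both directions."""
    providers = defaultdict(set)  # consumer -> services it calls
    consumers = defaultdict(set)  # provider -> services that call it
    for consumer, provider in edges:
        providers[consumer].add(provider)
        consumers[provider].add(consumer)
    return providers, consumers

def reachable(start, neighbors):
    """Breadth-first walk: everything reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

providers, consumers = build_graph(EDGES)
affects_me = reachable("my_app", providers)      # systems that can affect my_app
affected_by_me = reachable("my_app", consumers)  # systems my_app can affect
```

The catch, of course, is that in a real cloud system the edge list is exactly what you do not fully know, and it changes over time. The sketch only shows what you could ask once that information is available.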
The key is transparency
All of this leads me to what I think is the key conclusion that has to be reached about the future architecture of our shared cloud computing “system”: transparency is essential. Without a steady stream of feedback data from whatever sources we determine, over time, to have a significant impact on the operation of our applications, we will be unable to find the right “boundaries” for those applications.
Information about the functioning state of infrastructure (like compute nodes and networks), services (like data stores and platform services) or even other applications (like SaaS or your partners’ applications) will be critical to evolving the automation that successfully enables resiliency. And, as I noted in my keynote at last month’s excellent Gluecon in Colorado, one key goal in these systems is resiliency.
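As a rough illustration of how such feedback might drive automation, consider the following Python sketch. It assumes each dependency publishes some kind of health signal; the `HealthReport` type, the source names and the action names are all hypothetical, invented here for illustration rather than taken from any real provider's API.

```python
from dataclasses import dataclass

@dataclass
class HealthReport:
    source: str    # hypothetical dependency name
    kind: str      # "infrastructure", "service" or "application"
    healthy: bool  # latest state reported by that dependency

def plan_actions(reports, fallbacks):
    """Map each unhealthy dependency to a response.

    Dependencies with no known fallback are escalated to a person.
    """
    actions = {}
    for report in reports:
        if report.healthy:
            continue
        actions[report.source] = fallbacks.get(report.source, "alert_operator")
    return actions

# Illustrative feedback from the three kinds of sources mentioned above.
reports = [
    HealthReport("compute_node_7", "infrastructure", True),
    HealthReport("object_store", "service", False),
    HealthReport("partner_crm", "application", False),
]
fallbacks = {"object_store": "switch_to_replica"}

actions = plan_actions(reports, fallbacks)
```

Here `actions` maps `object_store` to `switch_to_replica` and `partner_crm` to `alert_operator`. The logic is trivial on purpose: none of it is possible unless the dependencies expose their state in the first place.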
Will such transparency happen? I believe it already is happening. Just this week, Amazon Web Services announced a method for downloading billing information for its services. At one point in time, it was postulated that Amazon would never do this. However, customers have spoken, and the need to access the real-time costs of Amazon’s services programmatically has forced transparency.
Regardless, it is important to start thinking about your applications and services in the cloud as systems, not just stand-alone components. The challenge before you is to determine what the boundaries of those systems are, and how to design, build and operate your software to thrive within those boundaries.