Do we need internet exchanges for public cloud data portability?

Session Name: Creating Data Sharing Ecosystems.
Speakers: S1 Announcer, S2 Barb Darrow, S3 William Gerhardt, S4 Robert Jenkins, S5 Audience Member 1, S6 Audience Member 2
Thank you so much, Phil. Up next, I want to bring my colleague, Barb Darrow, back to the stage. She's going to be interviewing Bill Gerhardt, who is the Director, Service Provider Practice, Internet Business Solutions Group at Cisco Systems, and Robert Jenkins, Co-Founder and CEO of CloudSigma. They're going to be discussing "Creating Data Sharing Ecosystems." Please welcome our next talk to the stage.
Hi, guys. We're in the home stretch here. I'm Barb Darrow. I'm a senior writer with GigaOM. I'll have my panelists introduce themselves, and then we'll get started. We really would love questions, so don't be shy about it.
Hi, my name is Bill Gerhardt. I'm with Cisco Systems. I'm in a group within Cisco called the Internet Business Solutions Group. What that stands for is basically our strategy organization that works with our top tier-1 customers, trying to figure out where the intersection is between technology and new business models.
We've been tracking, of course, this whole move around big data and analytics and what it means in terms of using the cloud and using the network. One of the things we kind of stumbled onto, that we'll talk about a little more, is that the network, whether it's an enterprise or a public network, provides a foundation not only for the collection of data, but also for taking action on that data in a much more efficient manner.
One of the things that we're focused on now is trying to evolve the industry from what we consider to be the early days, what we define as Wave One, in terms of where we are. And then, try to move us to where we want to go, which is out in this Wave Three world. We think the network plays this role of getting us through the Waves: from Wave One to Wave Two to Wave Three, where analytics become much more embraced and holistic across the organization. We'll talk a little bit more about that.
Hi, my name is Robert Jenkins. I'm the CEO and Co-Founder at CloudSigma. We're a public infrastructure-as-a-service provider. And really, I'd say the area of my work that's most relevant for this panel is our work with big institutions and big data private customers, who are moving these large-scale datasets into the cloud. It's really the work that we do as a public infrastructure-as-a-service provider to facilitate those datasets, both from a security perspective but also in unleashing the value that's in those datasets. I personally work with people at the European Space Agency, Sun, and private big data companies who are using public cloud, moving those private datasets into the cloud, and making them accessible to the right partners that they have, in a secure and effective way.
My first question has to do with data portability. In the old world, a lot of IT folks were afraid of vendor lock-in. They might not want to put all of their eggs in one vendor basket. Now with the cloud, we're seeing companies like Google, Amazon, Microsoft basically offer free data storage. There was a price war with the prices going down. They were trying to one-up each other, making it very easy and cheap for you to put data in their clouds. The question is, is there concern among users that once the data is there, it's kind of like the roach motel: you can check in but you can't check out? Is it as easy to move the data off those clouds, and if not, does that have to be settled? I'd love you guys to both talk about that.
Let me try to attack it from an importance perspective first. Then, we can try to figure out the best way to handle it. As we talk about those waves again, we talk about what I call the data plane. Today, in most organizations, whether you're a public-sector service provider or an enterprise, we're finding that data is very siloed. So, throughout the organization, it's not being openly shared with the users that could actually benefit from it. One of the things that we see in moving from Wave One to Wave Two, and we're seeing an early adoption of this, is that you get to the point where you have a common view, a common identity, or common shared practices across a common data warehouse. That becomes Wave Two.
By Wave Three, which is really where I think a lot of the euphoria is around some of the use cases and opportunities that we all talk about, day in and day out, reading in the magazines, we get to the point where we have a full ecosystem of data that's being shared. And there is a model in which there's governance that allows that data to move freely across the ecosystem, so that it can be applied where it needs to be, to get the most value. Obviously there's going to be a question of how I port my data if I put it into one place. I think at that point, once you have the clearing-house concept, the federation, the ecosystem in place, then the rules are going to be defined that are going to tell us how we share it. It won't work otherwise. So, if we want to get to where we need to be, out in Wave Three, it's going to be paramount that we figure out the governance model that allows us all to do that, freely, transparently, and obviously, in a private, secure way.
I would say data portability is obviously critical, and it's always a double-edged sword. There's a trade-off, because you have innovation and then you have standards. Facebook innovated, so there is no standard for what they're doing. Therefore, you have a sort of lock-in, because Facebook has their own proprietary systems. Not to pick on Facebook, because any innovative company is going to have new ways of doing this. If you're talking about data, it's a similar thing. You can use a service that may be unique and you get value from that uniqueness, but of course then, the flip side is that there isn't someone else that provides that same service. It's a trade-off that each customer has to make about what they want to do and which areas they specifically want to be more portable than others. Saying that, I think the public cloud providers and people like us are at the infrastructure layer, which isn't necessarily proprietary. And when you're talking about a drive image, I think there are pretty standard drive image formats that we can all agree to, and have done. They should make it portable, and yet with some of the things you see with vendors, even if they're using the same cloud stack, like OpenStack for example, you can't actually move the drive between two unnamed OpenStack clouds.
Why is that? Why is that infrastructure service provider not exposing that ability, even though they're using the technology? That's obviously a vendor problem, and it's not actually a technology problem.
That's kind of my point, the vendor behavior doesn't necessarily change. You're talking about standards. I would love you guys to take a guess about when standards will be in place so you can move even between public clouds, move your data trove fairly easily without too much friction. Is that two years away, four years away?
I think there's a governing force that's going to either accelerate it or slow it. I think at the end of the day, a lot of the data that we're collecting out there, probably 60+%, is really consumer driven. Today, the consumer is unaware of how their data is being collected and used, so they're a little nervous about that. Our ability to openly share and federate that data and allow it to be ported is being held back. We call that the Wave One of our consumer plane. By Wave Two, we start to see more opting in and opting out, allowing that consumer to actually choose when they see value and when they don't, and use their data accordingly. Eventually, we get to the point in Wave Three where we actually see an open marketplace for that data, very transparent, so that the consumer actually sees value, gets value, and they get reimbursed either with a better proposition from the provider, or some better experience.
At that point, suddenly, their data is very valuable. They're going to demand that the ecosystem share it in a way that they want it shared, and also that's not
In a way that benefits them.
benefits them. They can't have it stuck in one place, so all of these different dimensions around those waves move in harmony. If you don't get one of them to work, if the consumer is not part of that process, I don't think that they'll be... it will be a major problem around portability.
You don't have real federation between public clouds at the moment. You have abstraction layers. People build trading platforms, things like this, but there are no real big pipes at the infrastructure layer, specifically between cloud vendor A and cloud vendor B, designed to move people between the two. Actually, this is a problem that was solved in the telcos through internet exchanges. It's been done before: they obviously built out their networks, and then they had the problem of people wanting to use different networks, so they had to talk to each other. So then, you have internet exchanges and things like this. I think you will see the development of similar things, either using existing internet exchanges or other similar products. For example, we see this with data center providers starting to look at this, understanding, and this comes into hybrid, that customers want to use public cloud. So they want to make public clouds accessible to their customers, and also between public clouds.
The data centers are doing a lot of interesting work, essentially connecting those different participants to each other.
Having their products, for example: Equinix, in particular, has an IBX product where all their customers can patch into this exchange, and they can patch to any of the other customers in that metropolitan area, not just in that particular data center. Of course, those customers include public clouds. Using the IBX product in Amsterdam, for example, you could move data between Microsoft Azure, GoGrid, and CloudSigma, all located there. And that all exists on the backbone of Equinix. It's coming, but there isn't necessarily coordination between the cloud vendors themselves.
I'm curious about what both of you hear from the CTO, CIO, CEO level about what their concerns are, vis-à-vis data portability and just public cloud adoption in general. I know in financial services, there's some hesitance to go there.
One of the things you can do is decouple storage and compute. This comes into hybrid, and this is where networking is important as well. We have actual large banks, for example, that don't actually store data in our cloud. So, they do the compute, and the data in motion is being computed in our cloud, but they don't actually put it, storage-wise, in our cloud. Once it's computed and the result is there, that's piped back to the data center, and that's where the repository of data is. In this way, it's kind of like "diet public cloud" or something, because they don't actually have to worry about the data storage, and it makes a lot of sense. So, they say, "Long-term data storage is there, in our private cloud. We have all that locked down, and then basically, we can actually use the public cloud as a sort of brain that we can do a lot of computing against, and maybe not have the implications of the data storage longer term." This is one thing we see as well.
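The compute-storage split Jenkins describes can be sketched in a few lines. This is an illustrative toy, not CloudSigma's actual architecture; all function names and data are made up.

```python
# Sketch of the "diet public cloud" pattern: data at rest stays in the
# private data center; only data in motion goes to the public cloud for
# compute, and only the derived result comes back.

def private_store():
    """Stands in for the on-premises repository (the system of record)."""
    return {"transactions": [120, -45, 300, -10]}

def public_cloud_compute(batch):
    """Stands in for a stateless compute job in the public cloud.
    It receives data in motion, computes, and retains nothing."""
    return sum(batch)  # e.g. a net-position calculation

def run_job():
    data = private_store()["transactions"]   # read from private storage
    result = public_cloud_compute(data)      # compute happens off-premises
    # Only the result is piped back; the raw dataset never persists
    # in the public cloud.
    return result

print(run_job())  # 365
```

The key property is that `public_cloud_compute` is stateless: the bank keeps its data-residency guarantees while still renting burst compute.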
We recently did, in the last six months, some primary research, talking to CIOs and trying to get their perspective on where they are in terms of using some of the different clouds. One of their initial concerns, as you can imagine, is around security of the data, security of the connections, security of the processing. Is it really theirs? We were talking about it earlier; it's a little bit of fear on their part that they're not sure if it's really under their control. That's probably the number one fear. I think, interestingly, as we were talking, it's not necessarily founded on a lot; it's just the fear that if I don't own that asset, if it's not in my data center, then I really can't control it. I think that's a mindset that will change over time, probably driven by the economics of it. That's sort of where it is today.
Some would argue that even if the server is in your data center, you don't have that much control over it.
Well, that's flying on a plane versus driving a car. People generally have a fear of flying, not a fear of driving, even though driving is more dangerous than flying. That's not to belittle the concerns that people have with public cloud, because there are reasons for those concerns that need to be addressed by vendors. On the other hand, there is a natural psychological element about the change in control and the different business processes that they have to go through to use the workflow in a public cloud versus a private environment.
What are the primary concerns they should have about public cloud?
I think that data portability is one, because a lot of people are making investments in specific vendors that are totally vendor-specific, so they are sunk costs, essentially. There is no utility to that investment if they want to go to another vendor. In fact, they have to reverse the process, so it's a double investment. The vendor-specific stuff is a really big issue. You can look at big examples on specific clouds that have actually done that, and now they are coupled to that particular vendor. I think, in some cases, they regret it. That's a huge issue. Security is an issue, but I think there's a lot being done about this, on the networking side as well: offering much more secure options for customers to have private-only IP, for example, using public cloud and connecting it to private clouds. These are the areas of concern that we see. And also, reliability and performance levels. One of the things that we're looking forward to as a public cloud vendor is having a lot more transparency about the actual price-performance: the combination of the real price and the actual performance delivered, because it's not the same on different clouds, or on different aspects of those same clouds. For one use case, one cloud could be more cost-effective for the end user. For another use case, it could be another cloud, and there isn't really that transparency yet. So, there are interesting companies coming out that are really starting to build that matrix for a customer, so they can compare properly.
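The price-performance point can be made concrete with a toy comparison. All numbers and provider names below are invented for illustration; the takeaway is only that the "cheapest" cloud depends on the workload.

```python
# Effective cost varies per workload because raw price and delivered
# performance differ by provider. Hypothetical $/hour and relative
# performance figures for two made-up clouds:
CLOUDS = {
    "cloud-a": {"price_per_hour": 0.50, "perf": {"cpu": 1.0, "io": 0.5}},
    "cloud-b": {"price_per_hour": 0.80, "perf": {"cpu": 1.1, "io": 1.5}},
}

def cost_per_unit_of_work(cloud, workload):
    """Effective cost = hourly price divided by delivered performance."""
    c = CLOUDS[cloud]
    return c["price_per_hour"] / c["perf"][workload]

def cheapest(workload):
    """Pick the provider with the lowest effective cost for this workload."""
    return min(CLOUDS, key=lambda c: cost_per_unit_of_work(c, workload))

print(cheapest("cpu"))  # cloud-a: cheaper per unit of CPU-bound work
print(cheapest("io"))   # cloud-b: faster IO more than offsets its price
```

This is essentially the "matrix" Jenkins mentions: a customer needs both axes, price and measured performance, before the comparison means anything.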
One of the things I keep hearing, and it may be more from startups and vendors that run on Amazon than from enterprises: Amazon started out as infrastructure as a service. Then it's going up the stack, adding all sorts of more enterprisey services, and some of their vendor customers view that as a threat to themselves. And also, if you start supporting more of those higher-value services, you're more locked in. Do you agree with that? As you go up the stack, are you locked in?
I think there's architectural lock-in. There's data-portability lock-in, and I think there's an overall embracement of control of your assets. For me, I think about it most in terms of the architectural lock-in, because once I go down a path, it's hard to change my path once I've made a commitment. I think that's one of the fears that's holding some people back, especially the guys on the IT side that are very architecturally oriented. They don't want to go down a path that they don't feel like they can pull back from and go in a different direction. Let's face it, we've seen this move fairly frequently over the last three years, where we've gone from private to public to hybrid. So, you don't want to make an over-commitment where you can't back-pedal and then go in a different direction later. For me, I think that's one of the things that's holding them back.
Are there any questions in the audience? You can shout them out or go to a mic. Anyone, no?
I can comment on that issue as well. Going back to datasets, I was at a media show when Amazon announced that they were doing their own encoding service, and there were a lot of people that had encoding services built on Amazon and other clouds. So yes, competing with your own customers can be a risky strategy. I would say it like this: if you put data into a cloud, what you want is... data has gravity, so it will attract compute and services around it. Essentially, public clouds become data black holes that all this data and compute gets sucked in around, and there is a big momentum effect from that. What you want, really, is a healthy ecosystem of providers around that. I would argue that a healthier vendor approach for an infrastructure service provider, one that builds a better ecosystem and ultimately better value for the customer, would be to not engage in those services. And, actually, to build a framework and marketplace where the vendor doesn't necessarily have a horse in that race, but is providing a powerful framework for different service providers to securely interact with that data, for a private customer or a public dataset, or whatever it might be. So, they can actually leverage the fact that it's in a public cloud. Then the vendors stay out of that and maybe take a revenue share on an iTunes-type model, something like this, and instead be a facilitator rather than a direct competitor in those services. I don't believe a public cloud provider can offer fifty services and be the best at all of those; it's not possible.
At the same time, they can integrate it a lot more easily, in some cases, so they might have to seed the market, and they might be the only ones that are well positioned to get it done. So, they may jump in initially, which then creates the tension.
We have a question right here.
Thanks. I wonder if you could give some examples of industries or businesses, particularly in the Global 250 or 500, that are building an agile architecture so that they can move around now and in the future to take advantage of what you're talking about.
I can go first. From the service provider side, where I spend a lot of my time, we're seeing that the prevailing headwind is that every service provider is trying to figure out how to become more agile. Not only because they're competing fiercely on their services, which are in many cases being commoditized, but they're also, in parallel, trying to create an environment for service innovation so they don't have to go down that commodity path. So, we're seeing this new, giant, agile architecture, which is flexible and cloud-oriented. It's the separation of a variety of hardware and software capabilities, creating new ways to more efficiently drive new service innovation. That seems to be the mantra of the day. We're hearing that across the globe.
Just quickly, because I know we're running out of time and we have another question. I can't name the customer, but basically the customers we have that are the most sophisticated at being cross-cloud, generally what they do is provision standard virtual machines, and then they use Chef or Puppet to actually contextualize them. Then, they can run the same virtual machine environments across multiple clouds. The API is something that's really easy to code against, and to recode against multiple APIs, compared to the data lock-in and things like that. So, what they do is they have that type of methodology where they have a standardized virtual machine image. Then they can change the recipes, and that's how they are able to work within multiple cloud environments. That's the easiest and most effective way we see.
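The standard-image-plus-recipes pattern Jenkins describes can be sketched as below. Provider names, recipe names, and the `provision` function are hypothetical stand-ins, not real cloud SDK or Chef/Puppet calls.

```python
# Cross-cloud pattern: boot the same base image everywhere, then let a
# configuration-management tool apply provider-specific "recipes" to
# contextualize the machine. Only the recipes differ per provider.

STANDARD_IMAGE = "ubuntu-22.04-base"  # hypothetical shared image name

RECIPES = {
    "cloud-a": ["mount-object-store", "install-app", "register-dns"],
    "cloud-b": ["attach-block-volume", "install-app", "register-dns"],
}

def provision(provider):
    """Boot the shared image on a provider, then apply its recipes."""
    machine = {"provider": provider, "image": STANDARD_IMAGE, "steps": []}
    for recipe in RECIPES[provider]:
        machine["steps"].append(recipe)  # the config tool runs each recipe
    return machine

for p in RECIPES:
    m = provision(p)
    print(m["provider"], m["image"], m["steps"])
```

Because the image is identical everywhere, the per-provider surface area shrinks to the recipe list and the provisioning API, which is much easier to port than data or architecture.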
You talked about banks using the infrastructure as a brain. What's your take on the bandwidth? How could you upload terabytes of data and then re-upload it again when you want to calculate it again?
I'm sure Bill has a lot to say on the networking, so I'll give you time. I just wanted to quickly say, in this specific case, they have a private patch; they have multiple 10 Gig patches that go straight into our cloud, and it's actually mapped one-to-one with their private network within the cloud, which we're able to do. So, there isn't the overhead of a VPN or something like this. In this case, they're able to move large-scale data through, because they're in a large metropolitan area and they can connect relatively easily to us. Usually, they would do a private patch because, at the inflection point, it commercially makes sense for them rather than being on general IP.
Just quickly, we're seeing that there are two new, evolving technologies. One is called network functions virtualization and the other is software-defined networking. Where those come together, they allow you more flexible control over your software assets, where you don't have to deploy hardware. From a networking perspective, between servers within or amongst different data centers, you can actually scale the network up and down much more quickly. If I have a workload that might have a very low need today but suddenly bursts up tomorrow, I can allow it to burst without having to put physical hardware out there. Let's say I'm in an MPLS environment. They can actually scale up into the cloud much faster without having to pre-provision or over-provision. It gives them more flexible, dynamic provisioning.
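The elastic-bandwidth idea Gerhardt describes can be modeled in a few lines. This is a toy model of a software-defined allocation policy, not a real SDN controller API; the class and parameter names are invented.

```python
# Toy model of software-defined bandwidth: instead of pre-provisioning a
# fixed link, a controller adjusts allocated capacity as demand changes,
# between a committed baseline and a contracted burst cap.

class Link:
    def __init__(self, baseline_mbps, burst_cap_mbps):
        self.baseline = baseline_mbps   # committed minimum capacity
        self.cap = burst_cap_mbps       # maximum burstable capacity
        self.allocated = baseline_mbps

    def adjust(self, demand_mbps):
        """Scale allocation with demand: never below baseline, never
        above the burst cap, and no hardware changes required."""
        self.allocated = min(self.cap, max(self.baseline, demand_mbps))
        return self.allocated

link = Link(baseline_mbps=100, burst_cap_mbps=1000)
print(link.adjust(50))    # quiet period: stays at the 100 Mbps baseline
print(link.adjust(600))   # workload bursts: scales up on demand
print(link.adjust(2000))  # demand above contract: capped at 1000 Mbps
print(link.adjust(80))    # scales back down afterwards
```

The contrast with the MPLS example is the point: pre-provisioning would mean paying for 1000 Mbps permanently, whereas the software-defined policy tracks demand.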
We hit the red stop sign. Thank you very much for sticking around, see you tomorrow.