Dumbing down or smartening up — which is the bigger opportunity for big data?

Session Name: Cloud, Data and where the World is Headed.

Announcer

Derrick Harris
Michael Abbott
Jonathan Heiliger

Announcer 00:04

Thank you, sir. Just to remind everybody, the conference is being live-streamed, and there will be– Hello, everybody watching. All the sessions will be available on demand. You can find them on GigaOM.com. We’ll have all the sessions, where you can watch full video from it after the show. With that, I’m going to bring out our next panel. Mr. Derrick Harris. He’s the Senior Writer for GigaOM. He’s going to be talking with Michael Abbott – a general partner with Kleiner Perkins Caufield & Byers, and Jonathan Heiliger – a general partner at North Bridge Venture Partners – about cloud, data, and where the world is headed. Please welcome our next panel.

[applause]

[music]

Derrick Harris 00:52

Before we start, I just want to let you know. We’re going to leave five minutes for questions at the end. So if you think of them during the discussion. I want Mike and Jon to introduce themselves before we get started, just because– if you haven’t read their bios– the venture part is part of it, but also we’re going to talk about some of the stuff they’ve done in the past. If you’ll introduce yourselves briefly, that’d be fantastic.

Michael Abbott 01:17

Mike Abbott, General Partner at Kleiner Perkins. Before that, I led the engineering team at Twitter, focusing on redoing the infrastructure there and building the team. Before that, led the design and development of WebOS – so shipped that first version. Prior to that, I actually started a company called Composite Software in the data virtualization space that Cisco recently acquired.

Jonathan Heiliger 01:42

Jonathan Heiliger, Partner at North Bridge Venture Partners in Palo Alto. Prior to the year of being purely an investor, ran the infrastructure engineering for Facebook from when they had about 30 million people using the site to about 800 million people. Prior to that, several other roles in different tech companies – primarily in engineering, operations, and infrastructure. Like Mike, also founded a company in the late ’90s called Global Center, which was in the web hosting space and was acquired multiple times. I think it’s one of the– acquired four times or something like that, all with the same address. So it was a lot of different business cards.

[chuckles]

Derrick Harris 02:23

Our topic’s Cloud and big data, and where the world is headed. I’m still not sure the world is– based on Facebook, Twitter, these are large web-scale companies. When you look at what you were doing at those places, let’s say, and then looking at how these same sorts of ideas – Cloud, big data, next generation of infrastructure – how that translates into what you’re doing now. Let’s say, looking at software companies, looking at startups. Is there actually a connection [chuckles] between the Twitters of the world, the Facebooks of the world, and what the average company might need, or what the average enterprise software startup or vendor needs to look at?

Michael Abbott 03:06

I tend to think that it’s not really binary of yes or no. I think there’s parts that make sense and there’s parts that don’t. As an example, investments that both Facebook and Twitter have made around business intelligence or self-service, being able to ask questions of your data, make a great deal of sense. I think there’s many companies in the Fortune 1000 that would love to have that type of ability for IT to enable non-technical users to be able to ask questions of their information. By the same token, there’s other aspects when you’re dealing with technology and problems of that scale that just may not apply. Certainly from the startups that I meet with and focus on potentially investing in, we want to help them build a product that solves customer problems. Those problems are more painkillers and not vitamins, so they actually pay money. [chuckles] So, that’s obviously a very key requirement. And understanding how corporations actually buy that software and how they want it delivered. There are their own rules, because many companies have certain standards. There’s a lot of governance questions or issues that companies like Facebook and Twitter just bypass.

Jonathan Heiliger 04:26

Just to build for a moment on what Mike said, I think every enterprise can benefit from more resolution on data and much higher frequency access to data. Where I think the norm a few years ago was, Hey, I’ll wait 24 hours for that report. I’ll wait a few hours for that report. Or something like that. Whereas, the Internet companies of today – Facebook, Twitter, and other large Internet companies included – push the envelope because there are so many decisions you can make in real time for user experience that improve the user experience and improve the interaction that people will have with those products that can also be translated to the BMW website and how you work with the car configurator, how performant that application is. Also, quite frankly, how it can be tuned to what you’re doing at that moment in time, and being able to take into account as many different signals from different sources of data – whether that’s happening in the browser or on a back end system – is super critical.

Derrick Harris 05:25

Do you get a sense for how that boils down to– it sounds easy. Everyone can access data and have transparency and everything. As it [crosstalk] turns out, the stuff is really hard. So how do you actually boil that down into something that– Hadoop, for example, is the big buzz word. But that solves a fraction of that problem you’re talking about.

Michael Abbott 05:45

I think part of the answer is actually, not necessarily a technical one. It’s a topic that I frequently talk to folks about – and I’m sure Jonathan does too – which is, many companies want to understand, Hey, I want to move quick. I hear that products get shipped very quickly at Twitter. How do your teams actually get assembled? How do they actually build products? A lot of this is around empowering engineers to go solve these problems on their own, and actually removing a lot of the bureaucracy. One of the interesting things that’s actually occurred in data as a result of technologies like Hadoop is– ten years ago, you had to get a group of people in a room, define or at least outline what are all the types of questions or reports you might want to ask, build your star schema, build the data warehouse, and then potentially build some data marts off of that. Today, you can collect that data in Hadoop and not necessarily have to make all those decisions, and actually put it to more of a late binding type point where the queries can be run at the end of that process versus having to know all the questions up front. That’s a pretty big shift for companies. One challenge is, how do you explain that power to business makers who actually see the ROI from investing in those types of technologies. I don’t think we’ve totally crossed that yet. I think we’re in the process.

Derrick Harris 07:07

In terms of when you’re looking at a startup company that comes to you and says, Listen, we’re a cloud computing, cloud startup, or a data startup. I think those terms have lost some of their luster. What do you actually look for? What are you actually looking for at this point? What kind of product do you need to have other than, We’re delivering something as a service?

Jonathan Heiliger 07:32

I think it starts with this general thing when evaluating a team of people, is looking for a really fundamental problem that they’re solving. Hopefully, with a very unique approach or something that’s disruptive. It’s actually not that dissimilar from looking at technology that you’re going to buy and bring into your company. When you think about things of, I’m willing to swap out vendor A for vendor B if there’s a 5X performance difference or something like that. At least, when I look at new prospective ventures, someone has to be able to commit and articulate a vision that says I can make this problem 10X better or 100X or even 1,000X better in order to really have some follow-up and dig into, what is it you’re solving? It’s big data and it’s analysis. We’re making the user experience better, and this and that to actually understand what their vision is and what problem they’re solving. But it all starts with, there has to be some fundamental change. It’s not just you used to run this job on Oracle, now you can run it here and it’s twice as fast. Guess what? That’s actually not that interesting. Oracle will get there in the next product cycle, for example.

Michael Abbott 08:43

To your point, the technical defensibility is just critical. We spend a lot of time on the team understanding what are their backgrounds, what gives them that unique ability to do things that other companies actually can’t. But at the same time, understand as they go build that great product, how do they imagine that getting actually into the enterprise. One of the things that we periodically see is that we might have a great group of engineers that have a particular hypothesis or an idea around a product, but they really don’t have a strong sense of how to bring that into a company, into a business unit, into a business. So sometimes you can actually help them find that salesperson to bridge that gap. Sometimes you have to be cautious. It might just be a technology chasing a problem. So one of the things that we do to help mitigate that is bring in and have periodic meetings with various companies, CIOs, to try to make sure we understand the problems that people are actually facing today, and try to match those up with some of the products that we see companies actually building.

Derrick Harris 09:47

Do you guys have a sense of what the ideal applications are, now that we’re almost ten years into cloud computing as a legitimate thing? As big data’s the term that’s come and gone, and in vogue, out of vogue. But it’s a legitimate movement to this point. Do you get a sense for what the big applications are? If I’m a startup or if I’m even a business trying to adopt some of this stuff, how should I be looking at– what’s this actually going to get me, aside from just doing what I do with Oracle [chuckles] two times faster? What can I hope to achieve that’s revolutionary?

Michael Abbott 10:21

If you step back and look at the numbers alone, there’s about 2.8 zettabytes of data in the world. Less than 1% of that’s been analyzed. That presents a huge opportunity both on the investor side, but also for companies to look at how do you provide analytics tools to actually get more value out of this data that you’re actually storing? There is a real step forward around the discovery component of that that we’re just actually embarking on. Because again, traditionally, business intelligence vendors – whether it be Business Objects or Cognos – you had to know in advance what were the questions you wanted to ask, and then you stopped once you got that answer. In today’s world, you can start with that question, but you can actually have subsequent questions as a result of that first one. That’s actually real important as part of that whole discovery process to get more ROI out of the investments you’re making in big data.

Jonathan Heiliger 11:24

One of the things that struck me is that data science is a science. It’s part science, part art. We as an industry haven’t done a very good job at building curriculum and creating incentives to create heroes and role models within the data science community. Because I do think that there’s an opportunity to build better and better tools to dumb down analytics and make it easier for guys like me that use PowerPoint every day instead of engineers that use vi and Emacs to query data. But I’d even back up a step. I totally agree with what Mike is saying, which is that technology has enabled us to ask questions as we delve into data and steer around data at very high resolutions. But the bigger opportunity here is actually being able to educate people on data science and turn this into a real practice, much like you educate people on how to develop code in Java or C++ versus being a marketing person. Those are very different disciplines, just as the science of analyzing data is.

Derrick Harris 12:31

What about the data center level or at the infrastructure level? Is there real value in saying, We want to operate– part of a software defined– part of that whole thesis is that it’s operated like the web companies do to some degree. What Facebook’s doing with Open Compute, and Twitter now is talking about how it’s rewritten its applications and everything five times. Is there some value–? [crosstalk] Well, [chuckles] a few times – parts of it. Is there some value if I’m a regular company, saying, I want to look like this. I want to run like these guys. Or is that not tangible or even–?

Jonathan Heiliger 13:08

I think it’s really, really simple, actually to figure out if you should build your own versus use someone else’s infrastructure or software. It all starts with a TCO now. What is the total cost of ownership of this particular asset, whether it’s a server, a data center, an analytics tool or something like that? If you can do it cheaper yourself– where cheaper can mean actual dollars or it can mean flexibility in customization. I think that’s one thing that a lot of Internet companies have shown, is that open source may appear cheaper economically, but the actual real benefit that it creates to the company is having the flexibility to change the direction of that piece of software or hardware from Monday to Friday. If you can ascribe economic value to that ownership and that control, then by all means, you should be building data centers, you should be building hardware, taking advantage of software controlled networks, and those sorts of things. But if you can’t, then why wouldn’t you buy from someone else because that’s essentially saving you money, and focus on your core business, whatever that is – agriculture, finance, or gaming.

Michael Abbott 14:15

I see that the whole data center is being re-imagined. I think that we’re seeing virtualization certainly on compute. We’re seeing virtualization now in storage. We’re seeing now virtualization in networking, which is really what SDN’s about. To your point, there’s actually a pretty great efficiency gained from many organizations looking at that virtualization opportunity. I just want to come back to your prior question of future things. Beyond analytics, it’s really interesting to me– is once we actually have the big data infrastructure, what new applications are going to be built on top of that. What we haven’t seen yet is, if you were going to go build a next generation CRM app and you had this data, what would that actually look like? We haven’t seen that before. A couple years out, I think that application layer is going to get re-imagined as well.

Derrick Harris 15:03

I think we’re starting to see that now. I have seen that next generation CRM [chuckles] – your example. I think there’s a couple of those things floating around.

Jonathan Heiliger 15:11

There’s things that plug into your e-mail that monitor the last time that you and I exchanged e-mails, and tell you that Mike is a fading contact. That’s something you can do with access to all of that data that maybe you couldn’t do before – getting back to the analytics problem.

Derrick Harris 15:27

I just hope there’s more than social [chuckles]. [crosstalk] We’ll see. So we have a couple of minutes left. Do we have any questions? If so, you can step up to one of the microphones. Questions?

Jonathan Heiliger 15:38

Shy group of people this morning.

Derrick Harris 15:39

I have more.

Michael Abbott 15:40

Need more coffee.

[chuckles]

Derrick Harris 15:43

Here’s another thing. I get a lot of pitches and talk to a lot of startups who come out of companies like the Facebooks and the Twitters and Google, and they all have ideas based on stuff they’ve been doing. So we were talking about Mesos before, sometimes that’s something– we’re building Google F1 or whatever. How do you draw a line between– especially if you’re an enterprise looking at some of these technologies or if you’re an investor, where do you draw that line between this is actually targeting me or targeting a broad audience? This sounds cool, but at the end of the day, this is like trying to sell something that Google [chuckles] might want to buy.

Michael Abbott 16:30

This is why, at least, firms like ours build a pretty strong network and relationships with a lot of firms that are actually buying the technology, regardless of if it’s a company like Mesos or whatnot. What we do is we get on the phone or do meetings with Goldman Sachs, with JP Morgan, with Starbucks, and other companies, and maybe some of the folks in this room – your companies – to understand what are those problems and actually, how would they actually look at this technology. Because typically, if they’re solving a real problem again and it’s not just a vitamin and nice to have, the particular company has a strong desire, actually, to get that solution in. It’s also a way for us on the investment side to mitigate the risk of, is there actually real market demand for what they’re building.

Jonathan Heiliger 17:23

I do think a lot of that validation comes from talking to our networks, but I also think a lot of the Internet companies ended up building with very small teams some incredible products that would otherwise be a 20-person or 50-person team if it was selling to enterprises. So to be able to take the leverage that that team has created inside of a place like Google and turn that into a stand-alone product, I think can be wonderful. But to Mike’s point, it is a question of actually validating that other people are going to have that problem, and to some degree forecasting. I think there’s a large class of next generation of web companies that are probably their likely targets – maybe less so than the banks, for example, or something like that in terms of acquiring these kinds of technologies. Simply because, while the Internet companies are small, making a decision may yield this much difference in gross margin or cost or something like that. But as these businesses scale, all of a sudden there’s incredible leverage in those early decisions you make in the investments in technology. Being able to plug in early on to a technology like Mesos or something like that, that can allow you to scale very rapidly, very economically, can make a huge, huge difference to that company.

Derrick Harris 18:38

Last question. We’re over time, but one word answer from each of you. If there’s one thing you’re seeing – one technology trending upward – that mainstream companies should look at as the real deal– something they should invest in, what is it?

Michael Abbott 18:54

One word?

Derrick Harris 18:54

One word or maybe two, if it’s a phrase – like software–

Michael Abbott 18:59

Right now, it’s analytics. I think there’s a lot of questions. Companies are saying, Look, I understand what Hadoop can provide me, but what do I get? Or I’m storing all this data and I’m dealing with this data obesity problem. How do I actually show to my business owners the value?

Jonathan Heiliger 19:18

I think enterprise buyers are still frustrated with enterprise vendors, quite frankly. So the idea of the software-defined data center is becoming much more mainstream than it was a year ago, and so forth and so on. So I think that’s on the rise, whereas we’re sort of peaking out on data obesity.

Derrick Harris 19:37

Thank you.

[applause]

[music]

Announcer 19:48

Heiliger when? Today. All other [inaudible] can just go home. You can go on a break right now, because we are entering our morning break. So I want to remind you that we have three sponsor workshops going on upstairs on Level 1: EMC sponsored workshop, Aspera, and Softlayer. So please, go check those out and get there early because they do fill up. Then stop by the GigaOM research table to learn more about our research service, how you can get in on those sweet mapping sessions, and get your report as well. Then there are breaks and refreshments located throughout the area. Please make yourself at home, enjoy, and we’ll see you at 11:15. Have a good break everybody.