Voices in AI – Episode 66: A Conversation with Steve Ritter


About this Episode

Episode 66 of Voices in AI features host Byron Reese and Steve Ritter talking about the future of AGI and how AI will affect jobs, security, warfare, and privacy. Steve Ritter holds a B.S. in Cognitive Science, Computer Science and Economics from UC San Diego and is currently the CTO of Mitek.
Visit www.VoicesinAI.com to listen to this one-hour podcast or read the full transcript.

Transcript Excerpt

Byron Reese: This is Voices in AI, brought to you by GigaOm. I’m Byron Reese, and today our guest is Steve Ritter. He is the CTO of Mitek. He holds a Bachelor of Science in Cognitive Science, Computer Science and Economics from UC San Diego. Welcome to the show, Steve.
Steve Ritter: Thanks a lot, Byron, thanks for having me.
So tell me, what were you thinking way back in the ’80s when you said, “I’m going to study computers and brains”? What was going on in your teenage brain?
That’s a great question. So first off I started off with a Computer Science degree and I was exposed to the concepts of the early stages of machine learning and cognitive science through classes that forced me to deal with languages like LISP etc., and at the same time the University of California, San Diego was opening up their very first department dedicated to cognitive science. So I was just close to finishing up my Computer Science degree, and I decided to add Cognitive Science into it as well, simply because I was just really amazed and enthralled with the scope of what Cognitive Science was trying to cover. There was obviously the computational side, then the developmental psychology side, and then neuroscience, all combined to solve a host of different problems. You had so many researchers in that area that were applying it in many different ways, and I just found it fascinating, so I had to do it.
So, there’s human intelligence, or organic intelligence, or whatever you want to call it, there’s what we have, and then there’s artificial intelligence. In what ways are those things alike and in what ways are they not?
That’s a great question. I think it’s actually something that trips a lot of people up today when they hear about AI, and we might use the term, artificial basic intelligence, or general intelligence, as opposed to artificial intelligence. So a big difference is, on one hand we’re studying the brain and we’re trying to understand how the brain is organized to solve problems and from that derive architectures that we might use to solve other problems. It’s not necessarily the case that we’re trying to create a general intelligence or a consciousness, but we’re just trying to learn new ways to solve problems. So I really like the concept of neural inspired architectures, and that sort of thing. And that’s really the area that I’ve been focused on over the past 25 years, is really how can we apply these learning architectures to solve important business problems.
Byron explores issues around artificial intelligence and conscious computers in his new book The Fourth Age: Smart Robots, Conscious Computers, and the Future of Humanity.

Voices in AI – Episode 64: A Conversation with Eli David


About this Episode

Episode 64 of Voices in AI features host Byron Reese and Dr. Eli David discussing evolutionary computation, deep learning and neural networks, as well as AI’s role in improving cyber-security. Dr. David is the CTO and co-founder of Deep Instinct and has published multiple papers on deep learning and genetic algorithms in leading AI journals.

Transcript Excerpt

Byron Reese: This is Voices in AI, brought to you by GigaOm. I’m Byron Reese. And today, our guest is Dr. Eli David. He is the CTO and the co-founder of Deep Instinct. He’s an expert in the field of computational intelligence, specializing in deep learning and evolutionary computation. He’s published more than 30 papers in leading AI journals and conferences, mostly focusing on applications of deep learning and genetic algorithms in various real-world domains. Welcome to the show, Eli.
Eli David: Thank you very much. Great to be here.
So bring us up to date, or let everybody know what do we mean by evolutionary computation, and deep learning and neural networks? Because all three of those are things that, let’s just say, they aren’t necessarily crystal clear in everybody’s minds what they are. So let’s begin by defining your terms. Explain those three concepts to us.
Sure, definitely. Now, both neural networks and evolutionary computation take inspiration from intelligence in nature. If, instead of trying to come up with smart mathematical ways of creating intelligence, we just look at nature to see how intelligence works there, we can reach two very obvious conclusions. First, the only algorithm that is in charge of creating intelligence – we started from single-cell organisms billions of years ago, and now we are intelligent organisms – and the main algorithm, or maybe the only algorithm, in charge of that was evolution. So evolutionary computation takes inspiration from the evolutionary process in nature and tries to evolve computer programs so that, from one generation to the next, they will become smarter and smarter, and the smarter they are, the more they breed, the more children they have, and so, hopefully, the smart gene improves one generation after the other.
The other thing that we will notice when we observe nature is brains. Nearly all the intelligence in humans, other mammals, or other intelligent animals is due to a neural network, a network of neurons which we refer to as a brain: many small processing units connected to each other via what we call synapses. In our brains, for example, we have many tens of billions of such neurons, each one of them, on average, connected to about ten thousand other neurons, and these small processing units connected to each other, they create the brain; they create all our intelligence. So the two fields of evolutionary computation and artificial neural networks, nowadays referred to as deep learning, and we will shortly dwell on the difference as well, take direct inspiration from nature.
Now, what is the difference between deep learning, deep neural networks, traditional neural networks, etc? So, neural networks is not a new field. Already in the 1980s, we had most of the concepts that we have today. But the main difference is that during the past several years, we had several major breakthroughs, while until then, we could train only shallow neural networks, shallow artificial neural networks, just a few layers of neurons, just a few thousand synapses, connectors. A few years ago, we managed to make these neural networks deep, so instead of a few layers, we have many tens of layers; instead of a few thousand connectors, we have now hundreds of millions, or billions, of connectors. So instead of having shallow neural networks, nowadays we have deep neural networks, also known as deep learning. So deep learning and deep neural networks are synonyms.
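Editor's illustration: the evolutionary loop described in this answer (score every candidate, let the fitter ones breed more, repeat across generations) can be sketched as a toy genetic algorithm. This is a minimal sketch under stated assumptions, not code from the guest or from Deep Instinct. The target string, population size, mutation rate, and all function names below are hypothetical choices made only for the example.

```python
import random

# Hypothetical goal: evolve random strings toward this target phrase.
TARGET = "deep learning"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def fitness(candidate: str) -> int:
    """Count the positions where the candidate already matches the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate: str, rate: float, rng: random.Random) -> str:
    """Replace each character with a random one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in candidate
    )


def crossover(mom: str, dad: str, rng: random.Random) -> str:
    """Build a child from the start of one parent and the end of the other."""
    cut = rng.randrange(1, len(TARGET))
    return mom[:cut] + dad[cut:]


def evolve(pop_size=200, rate=0.02, max_generations=1000, seed=0):
    """Return (best candidate found, number of generations completed)."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(pop_size)
    ]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        best = population[0]
        if best == TARGET:
            return best, generation
        # Only the fittest quarter of the population gets to breed.
        parents = population[: pop_size // 4]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - 1)
        ]
        # Carry the current best over unchanged so fitness never regresses.
        population = [best] + children
    return max(population, key=fitness), max_generations


if __name__ == "__main__":
    best, generations = evolve()
    print(f"best={best!r} fitness={fitness(best)} after {generations} generations")
```

Keeping the best individual unchanged each generation is a design choice of this sketch: it guarantees the top score never drops, at the cost of some diversity. Real evolutionary computation evolves programs or network structures rather than strings, but the select, breed, and mutate loop has the same shape.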

Voices in AI – Episode 22: A Conversation with Rudina Seseri

In this episode, Byron and Rudina talk about the AI talent pool, cyber security, the future of learning, and privacy.
Byron Reese: This is Voices in AI brought to you by Gigaom. I’m Byron Reese. Today, our guest is Rudina Seseri. She is the founding and managing partner over at Glasswing Ventures. She’s also an entrepreneur in residence at Harvard Business School and she holds an MBA from that same institution. Welcome to the show, Rudina.
Rudina Seseri: Hello Byron. Thank you for having me.
You wrote a really good piece for Gigaom, as a matter of fact; it was your advice to startups—don’t say you’re doing AI just to have the buzzwords on the side, you better be able to say what you’re really doing.  
What is your operational definition of artificial intelligence, and can you expand on that theme? Because I think it’s really good advice.
Sure, happy to. AI—and I think of it as the wave of disruption—has become such a popular term, and I think there are definitional challenges in the market. From my perspective, and at the very highest level, AI is technology, largely computers and software, that possesses or has some level of intelligence that mirrors that of humans. It’s as basic as one would imagine it to be by the very name artificial intelligence.
Where I think we are in the AI maturity curve, if one wants to express it in such a form, is really the early days of AI and the impact it is having and will have going forward. It’s really, what I would call, “narrow AI” in that we’re not at a point where machines, in general, can operate at the same level of diversity and complexity as the human mind. But for narrow purposes, or in a narrow function—for a number of areas across enterprise and consumer businesses—AI can be really transformational, even narrow AI.
Expressed differently, we think of AI as anything—such as visual recognition, social cognition, speech recognition—underpinned with a level of machine learning, with a particular interest around deep learning. I hope that helps.
That’s wonderful. You’re an investor so you get pitches all the time and you’re bound to see ones where the term AI is used, and it’s really just in there to play “buzzword bingo” and all of that… Because, your definition that it’s, “doing things humans would normally do” kind of takes me back to my cat food bowl that fills itself up when it’s empty. It’s weighing and measuring it so that I don’t have to. I used to do it, and now a computer does it. Surely, if you saw that in a business case, like, “We have an AI cat food bowl,” that really isn’t AI, or is it? And then you’ve got things like the Nest, which is a learning system. It learns as you do it, and yours is eventually going to be different than mine—I think that is clearly in the AI camp. What would be a case of something that you would see in a business case and just roll your eyes?
To address your examples and give you a few illustrations, I think in your example of the cat food plate or whatnot, I think you’re describing automation much more than AI. And you can automate it because it’s very prescriptive—if A takes place, then do B; if C takes place, then do D. I think that’s very different than AI.
I think when technologies and products are leveraging artificial intelligence, you are really looking for a learning capability. Although, to be perfectly honest, even within the world of artificial intelligence, researchers don’t agree on whether learning, in and of its own, qualifies as AI. But, coming back to everyday applications, I think, much like the human mind learns, in artificial intelligence, whatever facet of it, we are looking for some level of learning. For sure, there’s a differentiator.
To then address your question head on, my goodness, we’re seeing AI disrupt all facets—from cyber security and martech to IT and HR to new robotics platforms—it’s running the whole gamut. Why don’t I give you a perfect example, that’s a real example, and I can give you the name of a portfolio company so we make it even more practical and less hypothetical?
One of my recent investments is a company called Talla. Talla is taking advantage of natural language processing capabilities for the HR and IT organizations in particular, where they’re automating lower level tickets, Q&A for issues that an employee may have—maybe an outage of email or some other question around an HR benefit—and instead of having a human address the question, it is actually the bot that’s addressing the question. The bot is initially augmenting, so if the question is too complex and the bot can only take the answer so far and can’t fully address the particular question, then the human becomes involved. But the bot is learning, so when a second person has a similar question, the bot can actually address it fully.
In that instance, you have both natural language processing and a lot of learning, because no two humans ask the very same question. And even if we are asking the same question, we do not ask it in the same manner. That’s the beauty of our species. So, there’s a lot of learning that goes on in that regard. And, of course, it’s also the case that it’s driving productivity and augmentation. Does that address your question, Byron?
Absolutely. That’s Rob May’s company, isn’t it?
Yes, it is.
I know Rob; he’s a brilliant guy.
Specifically, with that concept, as we are able to automate more things at a human level, like customer service inquiries, how important do you think it is that the end-to-end user knows that they’re talking to a bot of some kind, as opposed to a person?
When you say “know,” are you trying to get at the societal norm of what… Is this a normative question?
Exactly. If I ask where is your FAQ and “Julia”—in air quotes—says, “Here. Our FAQs are located here,” and there was no human involved, how important is it that I, as an end user, know that it’s called “Julia Bot” not “Julia”?
I think disclosure is always best. There’s nothing to be hidden, there’s nothing that’s occurring that’s untoward. In that regard, I would personally advocate for erring on the side of disclosure rather than not, especially if there is learning involved, which means observing, on the part of the bot. I think it would be important. I also think that we’re in the early days of this type of technology being adopted and becoming pervasive that the best practices and norms have yet to be established.
Where I suspect you will see both, is what I call the “New York Times risk”—where we’ll have a lot more discussion around what’s an acceptable norm and what’s right and wrong in this emerging paradigm—when we read a story where something went the wrong way. Then we will all weigh in, and the bodies will come together and establish norms. But, I think, fundamentally, erring on the side of disclosure serves a company well at all times.
You’re an investor. You see all kinds of businesses coming along. Do you have an investment thesis like, “I am really interested in artificial intelligence applied to enterprises”? What is your thesis?
We refer to our thesis as—not only do we have a thesis, but I think we have a good name to capture it—“Intelligent, Connect and Protect,” wherein our firm strategy is to invest in startups that are really disrupting, in a positive manner, and revolutionizing the enterprise—from sales tech and martech, to pure IT and data; around platforms, be those software platforms or robotics and the like; as well as cyber security and infrastructure.
So that first part, around enterprise and platforms, is the “Connect” world and then the cyber security and the infrastructure is the protection of that ecosystem. The reason why we don’t just call it “Connect and Protect” is because with every single startup that we invest in, core to our strategy is the utilization, or taking advantage, of artificial intelligence, so that is the “Intelligent” part in describing, or in capturing our thesis.
Said differently, we fundamentally believe that if a technology startup, in this day and age, is not leveraging some form of machine learning, some facet of AI, it’s putting itself at a disadvantage from day one. Put more directly, it becomes legacy from the get-go, because from a performance point of view those legacy products, or products without any kind of learning and AI, just won’t be able to keep up and outperform their peers that do.
You’re based in Boston. Are you doing most of your investing on the East Coast?
For the most part, correct. Yes. East Coast, and in other pockets of opportunity where our strategy holds. There are some interesting things in areas like Atlanta with security, even in certain parts of Europe like London, Berlin, Munich, etcetera, but yes.
Is AI being used for different things on the East Coast than what we think of in Silicon Valley? Can you go into that a little more? Where do you see pockets that are doing different things?
I think AI is a massive wave, and I think we would be in our own bubble if we thought that it was divided by certain coasts. Where I think it manifests itself differently, however—and I think it’s impacting at a global level to be honest rather than in our own microcosms—is where you see a difference in the concentration of the talent pool around AI, and especially deep learning. Because, keep in mind, the notion of specializing in machine learning or visual cognition, but particularly deep learning, is the best example, it didn’t exist before 2012. We talk a lot about data scientists, but the true data scientists and machine learning experts are very, very hard to come by, because it is, in many ways, driven by the explosion in data, and then the maturity that the whole deep learning field is achieving to be commercializable, for the techniques to be used in real products. It’s all very new, only existing in the last five to—if you want to be generous—ten years.
From that perspective, where talent is concentrated makes a difference. To come back to how, maybe, the East Coast compares, I think we will see AI companies across the board. I’m very optimistic, in that I think we have quite a bit of concentration of AI through the universities on the East Coast. I think of MIT, Carnegie Mellon, and Cornell; and what we’re seeing come out of Harvard and BU on the NLP side.
Across the universities, there are very, very deep pockets of talent, and I think that manifests itself both with the number and high quality of AI-enabled products and startups that we’re seeing get launched, but also for, what one would call, the “incumbents” such as Facebook, Amazon, Google, Uber, and the list goes on; if you look closely at where their AI teams are—even though almost all the companies I just mentioned are headquartered in the Valley and, in the case of Amazon, in Seattle—their AI talent is concentrated on the East Coast; probably most notably is Facebook’s AI headquartered in New York. So, combine that talent concentration with the market, that we, in particular, focus with our strategy around—the enterprise—where the East Coast has always had, and continues to have an advantage, I think, it’s an interesting moment in time.
I assume with the concentration of government on the East Coast and finance on the East Coast, that you see more technologies like security and those sorts of things. Specifically, with security, there’s been this game that’s gone back and forth for thousands of years between people who make codes, and people who break them. And nobody’s ever really come to an agreement about who has the harder job. Can you make an unbreakable code, and can it be broken? Do you think AI helps those who want to violate security, or those who want to defend against those violations, right now?
I think AI will play an important role in defending and securing the ecosystem. The reason I say that is because, in this day and age, with the exploding number of devices, and pervasive connectivity everywhere—translated in cyber security lingo, an increase in the number of endpoints, the areas of vulnerability, whether it is at the network level and device level or whether it is at the data and identity levels—has made us a lot more vulnerable, which is sort of the paradigm we live in.
Where I think AI and machine learning can be true differentiators is that not only can they be leveraged, again, for the various software solutions to continuously learn, but also on the predictive side they can point out where a vulnerability attack is being predicted before it actually takes place. There are certain patterns that help the enterprise to hone in on the vulnerability—from assessment to time of attack, at or during the attack, and then post attack. I do think that AI is a really meaningful differentiator for cyber security.
You alluded, just a moment ago, to the lack of talent; there just aren’t enough people who are well-versed in a lot of these topics. How does that shake out? Do you think that we will use artificial intelligence to make up for shortage of people with the skills? Or, do you think that universities are going to produce a surge of new talent coming in? How do we solve that? Because you look out your window, and almost everything you see, you could figure out how we could use data to study that and make it better. It’s kind of a blue ocean. What do you think is going to happen in the talent marketplace to solve for that?
AI eventually will be a layer, you’re absolutely right. From that perspective, I cannot come up with an area where AI will not play a role, broadly put, for the foreseeable future and for a long time in the future.
In terms of the talent challenge, let me address your question twofold. The talent shortage challenge that we have right now stems from the fact that it’s a relatively new field, or the resurgence of the field, and the ability to now actually deploy it in the real world and commercialize; this is what’s driving this demand. It’s the demand that has spurred it, and of course, the supply for that adjustment to take place requires talent, if I can think of it in that manner, and it’s not there. It’s a bit of a matter of market timing at one level. For sure, we will see many more students enter the field, many more students specialize and get trained in machine learning.
Then the real question becomes will part of their functions be automated? Will we need fewer humans to perform the same functions, which I think was the second part of your question if I understood it correctly?
I think we’re in a phase of augmentation. And we’ve seen this in the past. Think about this, Byron: how did developers code, going back ten to fifteen years ago? Largely, in different languages, but largely, from the ground up. How do they code today? I don’t know of any developer who doesn’t use the tools available to get a quick spin up, and to ramp up quickly.
AI and machine learning are no different. Not every company is going to build their own neural net. Quite the opposite. A lot of them will use what’s open source and available out there in the market, or what’s commercialized for their needs. They might do some customization on top, and then they will focus on the product they’re building.
The fact that you will see part of the machine learning function that’s being performed by the data scientists be somewhat automated should come as no surprise, and that has nothing to do with AI. That has to do with driving efficiencies and getting tools and having access to open source support, if you will.
I think down the road—where AI plays a role both in augmentation and in automation—we will see definitional changes to what it means to be in a certain profession. For example, I think a medical doctor of the future might look, from a day-to-day activity point of view, very differently than what we perceive a doctor’s role to be—from interaction to what they’re trained at. The fact that a machine learning expert and a data scientist—which by the way are not the same thing but for the sake of argument, I’m using them interchangeably—are going to use tools, and not start from scratch but are going to leverage some level of automation and AI learning is par for the course.
When I give talks on these topics, especially on artificial intelligence, I always get asked the question, “What should I, or what should my children, study to remain employable in the future?”—and we’ll talk about that in a minute, about how AI kind of shakes up all of that.
There are two kind of extreme ends on this. One school of thought says everyone in school should learn how to code, everyone. It’s just like one of the three R’s, but it starts with a C. Everyone should learn to code. And then Mark Cuban, at South by Southwest here in Austin, said that the first trillionaires are going to be from AI companies because it offers the ability to make better decisions, right? And he said if he were coming up today, he would study philosophy, because it’s going to be that kind of thinking that allows you to use these technologies, and to understand how to apply them and whatnot.
On that spectrum of everyone should code, or no, we might just be making a glut of people to code, when what we really need are people to think about how to use these technologies and so forth, what would you say to that?
I have a 4-year-old daughter, so you better believe that I think about this topic quite a bit. My view is that AI is an enabler. It’s a tool for us as a society to augment and automate the mundane, and give us more ability and more room for creativity and different thinking. I would hope to God that the students of the future will study philosophy, they will study math, they will study the arts, they will study all the sciences that we know, and then some. Creativity of thinking, and diversity of thinking will remain the most precious asset we have, in my view.
I do think that, much like children today study the core hard sciences of math and chemistry and biology as well as literature, part of the core curriculum in the future will probably be some form of advanced data statistics, or intro machine learning, or some level of computer science. We will see some technology training that becomes core, but I think that is a very, very, very different discussion than, “Everybody should study computer science or, looking forward, everybody should be a roboticist or machine learning expert or AI expert.” We need all the differentiation in thinking that we can get. Philosophy does matter, because what we do today shapes the present and society in the future.
Back to the talent question, to your point about someone who is well-versed in machine learning—which is different than data science, as you were saying—do you think those jobs are very difficult, and we’re always going to have a shortage of them because they’re just really hard? Or, do you think it’s just a case that we haven’t really taught them that much and they’re not any harder than coding in C or something? Which of those two things do you think it is?
I think it’s a bit more the latter than the former, that it’s a relatively new field. Yes, math and quants matter in this area, but it’s a new field. It will be talent that has certain predisposition around, like I said, math and quants, yes for sure. But, I do think that the shortage that we experience has a lot more to do with the newness of the field rather than the lack of interest or the lack of qualified talent or lack of aptitude.
One thing, when people say, “How can I spot a place to use artificial intelligence in my enterprise?” one thing I say is find things that look like games. Because every time AI wins in chess, and beats Ken Jennings in Jeopardy and Lee Sedol in Go—the games are really neat because they are these very constrained universes with definable rules and clear objectives.
So, for example, you mentioned HR in your list of all the things it was going to affect, so I’ll use that one. When you have a bunch of resumes, and you’ve hired some people that get great performance reviews, and some people that don’t, and you can think of them as points, or whatever—and you can then look at it as a big game, and you can then try to predict, you know? You can go into each part of the enterprise and say, “What looks like a game here?” Do you have a rule like that or just a guiding metaphor in your own mind? Because, you see all these business plans, right? Is there something like that, that you’re looking for?
There were several questions embedded in this. Let me see if I can decouple a couple of them. I think any area that is data-driven, any facet of the enterprise that is data-driven or where there is information, I think you can leverage learning and narrow AI for predictive, so you used some of the keywords. Are there opportunities for optimization? Are there areas where analytics are involved where you can move away from basic statistical models, and can start leveraging AI? I think where there is room for efficiency and automation, you can leverage it. It’s hard not to find an area where you can leverage it. The question is where can you create the most value?
For example, if you are on the forefront of an enterprise on the sales side, can you leverage AI? Of course, you can—not all prospective customers are created equal, there are better funnels, you can leverage predictives; the more and better data you have, the better are the outcomes. At the end of the day, your neural net will perform as well as the data you put in: junk in, junk out. That’s one facet.
If you’re looking at the marketing and technology side, think about how one can leverage machine learning and predictives around advertising, particularly on the programmatic side, so that you’re personalizing your engagement in whichever capacity with your consumer or your buyer. We can go down the list, Byron. I think the better question is, “What are the lower-hanging fruits where I can start taking advantage of AI right away, and which ones will I wait on?” rather than, “Do I have any areas?” If the particular manager or business person can’t find any areas, I think they’re missing the big picture, and the day-to-day execution.
I remember in the ’90s when the consumer web became a big thing, and companies had a web department and they had a web strategy, and now that’s not really a thing, because the internet is part of your business. Do you think we’re like that with artificial intelligence, where it’s siloed now, but eventually, we won’t talk about it the way we’re talking about it now?
I do think so. I often get asked the very same question, “How do I think AI will shape up?” and I think AI will be a layer much like the internet has become a layer. I absolutely do. I think we will see tools and capabilities that will be ever pervasive.
Since AIs are only as good as the data you train them on, does it seem monopolistic to you that certain companies are in a place where they can constantly get more and more and more data, which they can therefore use to make their businesses stronger and stronger and stronger, and it’s hard for new entrants to come in because they don’t have access to the data? Do you think that data monopolies will become kind of a thing, and we’ll have to think about how to regulate them or how to make them available, or is that not likely?
I think the possession of data is, for sure, a barrier to entry in the market, and I do think that the current incumbents, probably more than we’ve ever seen before, have built this barrier to entry by amalgamating the data. How it will shake out… First of all, two thoughts: one, even though they have amassed huge amounts of data with this whole pervasive connectivity, and devices that stay connected all the time, even the large incumbents are only scratching the surface of the data we are generating, and the growth that we’ll continue to see on the data side. So, even though it feels oligarchy-like, maybe—not quite monopolistic—that the big players have so much data, I think we’re generating even more data going forward. So that’s sort of at the highest level.
I do think that, particularly on the consumer side, something needs to be done around customers taking control of their data. I think brands and advertisers have been squatting on consumer data with very little in return for us. I think, again, one can leverage AI in predictives, in that regard, to compensate—whether it’s through an experience or in some other form—consumers for their personal private data being used. And, we probably need some form of regulation, and I don’t know if it’s at the industry standard level, or with more regulatory bodies involved.
Not sure if you follow Sir Timothy Berners-Lee who invented the web, but he does talk a lot about data centralization. I think there is something quite substantive in his statements around centralizing the web and all the data and giving consumers a say. I think we’re seeing a bit of ground swell in that regard. How it will manifest itself? I’m not quite sure, but I do think that the discussion around data will remain very relevant and become even more important as the amount of data increases, and as it becomes critical in a barrier to entry for future businesses.
With regard to privacy in AI, do you think that we are just in a post-privacy world? Because so much of what you do is recorded one way or the other that data just exists and we’ll eventually get used to that. Or do you think people are always going to insist on the protections that you’re talking about, and ways to guarantee their anonymity; and that the technology will actually be used to help promote privacy, not to wear it down?
I think we haven’t given up on privacy. I think the definition of privacy might have changed, especially with the millennials and the social norms that they have been driving, and, largely, the rest of the population has adopted. I’d say we have a redefinition of privacy, but for sure, we haven’t given up on it; even the younger generations who often get accused of doing so. And you don’t need to take my word on it, look at what happened with Snap. Basically, in the early days, it was really almost tweens but let’s say it was teenagers who were on Snapchat and what they were doing was “borderline misbehavior” because it was going to go away, it wouldn’t leave a footprint. The value prop being that it disappears, so your privacy, your behavior, does not become exposed to the broader world. It mattered, and, in my view, it was a critical factor in the growth that the company saw.
I think you’d be hard pressed to find people, I’m sure they exist but I think they are in the minority, that would say, “Oh, I don’t care. Put all of my data, 24/7, let the world know what I’m up to.” Even on the exhibitionist side, I think there’s a limit to that. We care about privacy. How we define it today, I suspect, is very different than how we defined it in the past and that is something that’s still a bit more nebulous.
I completely agree with that. My experience with young people is they are onto it, they understand it better and they are all about it. Anyway, I completely agree with all of that.
So, what about European efforts with regard to the “right to know why”? If an artificial intelligence makes a decision that impacts your life—like gives you a loan or doesn’t—you have the right to know how that conclusion was made. How does that work in a world of neural nets where there may not be a why that’s understandable, kind of, in plain English? Do you think that that is going to hold up the development of black box systems, or that that’s a passing fad? What are your thoughts on that?
I think Europe has always been on the side of protecting consumers. We were just thinking about privacy, and look at what they are doing with GDPR, and what’s coming to market from the data point of view on the topic we were just wrapping up. I think, as we gain a better understanding of AI and as the field matures, if we hide behind, “We don’t quite know how the decision was made,” and we may not fully comprehend but if we hide behind the, “Oh, it’s hard to explain and people can’t understand it,” I think at some point it becomes a cop-out. I don’t think we need to educate everyone on how neural nets and deep learning are performed, but I think you can talk about the fundamentals of what are the drivers, how are they interacting with each other, and at a minimum, you can give the consumer some basic level of understanding as to where they probably outperformed or underperformed.
It reminds me, in tech, we used to use acronyms in talking to each other, and making everybody feel like they were less intelligent than the rest of the world. I don’t think we need to go into the science of artificial intelligence machine learning to help consumers understand how decisions were made. Because guess what? If we can’t explain it to the consumer, the person on the other side that’s managing the relationship will not understand it themselves.
I think you’re right, but, if you ask Google, “Why did this page come number one for this search?” the answer, “We don’t know,” is perfectly understandable. It’s six hundred different algorithms that go into how they rank pages—or whatever the number is, it’s big. So, how can they know why this is page number one and that is page number two?
They may not know fully, or it may take some effort to drill in specifically as to why, but at some level they can tell you what some of the underlying drivers were behind the ranking or how the ranking algorithms took place etcetera, etcetera. I think, Byron, what you and I are going back and forth on is, in my view, it’s a level of granularity question, rather than can they or can they not. It’s not a yes or a no, it’s a granularity question.
There’s a lot of fear in the world around the effect that artificial intelligence is going to have on people, and one of the fear areas is the effect on jobs. As you know, there kind of are three narratives. One narrative is that there are some people who don’t have a lot of training in things that machines can’t do, and the machines are eventually going to take their jobs, and that we’ll have some portion of the population that’s permanently unemployed, like a permanent Great Depression.
Then there’s a school of thought that says, “No, no, no. Everybody’s replaceable by a machine, that eventually, they’re going to get to a point where they can learn something new faster than a human, and then we’re all out of work.”
And then there’s a third group that says, “No, no, no, we’re not going to have any unemployment because we’ve had disruptive technologies: electricity, replacing animals with machines, and steam; all these really disruptive technologies, and unemployment never spiked because of those. All that happens is people learned to use those tools to increase their own productivity.”
My question to you is, which of those three narratives, or is there a fourth one, do you identify with?
I would say I identify only in part with the last narrative. I do think we will see job displacement. I do think we will see job displacement in categories of workers that we would have normally considered highly-skilled. In my view, what’s different about the paradigm we are in vis-à-vis, let’s say, the Industrial Revolution, is that it is not the lowest-trained workers or the highly-specialized workers—if you think about artisanal-type workers back in the day—that get displaced out of their roles, and, through automation, replaced by machines in the Industrial Revolution, or here by technology and the AI paradigm.
I think what’s tricky with the current paradigm is that the middle class and the upper middle class get impacted as much as the less-trained, low-skilled workers. There will be medical doctors, there will be attorneys, there will be highly-educated parts of the workforce whose jobs will, in large part, be redefined, and some of those jobs may be done away with. And, very analogous to the discussion we were just having about the shortage of machine learning experts, we’ll see older generations who are still seeking to be active members of the workforce be put out of the labor market, or no longer be qualified and require new training, and it will be a challenge for them to gain the training to be as high of a performer as someone who has been learning the particular skill, say in medicine, in an AI paradigm from the get-go.
I think we’ll see a shift in job definitions, and a displacement of meaningful chunks of the highly-trained workforce, and that will have significant societal consequences as well as economic consequences. Which is why I think a form of guaranteed basic income is a worthy discussion, at least until that generation of workers get settled and the new labor force that’s highly-trained in an AI-type of paradigm comes into play.
I also think there will be many, many, many new jobs and professions that will be created that we have yet to think about or even imagine as a result. I do not think that AI is a net negative in terms of creating entire unemployment or lower employment. It’s not a net negative. I think (and McKinsey and many, many others have done studies on this) that in the long term, we’ll probably see more employment than not created as a result of AI. But, at any point in time, as we look at the AI disruption and adoption over the next few decades, I think we will see moments of pain and meaningful pain.
That’s really interesting because, in the United States, as an example, since the industrial revolution, unemployment has been between five and nine percent, without fail five and nine percent, except the Great Depression which nobody said was caused by technology. If you think about an assembly line, an assembly line is AI. If you were making cars one at a time in a garage, and then all of a sudden, Henry Ford shows up and he makes them a hundred at a time and sells them for a tenth the price and they’re better, that has got to be like, “Oh my gosh, this AI, this technology just really upset this enormous amount of people,” and yet you never see unemployment go above nine percent in this country.
I will leave the predictions of the magnitude of the impact to the macroeconomists; I will focus on startups. But I do think, let me stick with that example, so you have artisanal shops and sewing by hand, and then the machine comes along and the factory line, and now it’s all automated, and you and others are displaced. So, for every ten of you who were working, one is now on the factory line and nine are finding themselves out of a position. That was the paradigm I was describing a minute ago with doctors and lawyers and other professions, that a lot of their function will become automated or replaced by AI. But then, it’s also the case that now their children or their grandchildren are studying outer space, or are going into astronomy and other fields that we might have, at a folklore level, thought about, but never expected that we’d get there; so, new fields emerge.
The pain will be felt, though. What do you do with the nine out of ten who are, right there and then, out of a position? In the long term, in an AI paradigm, we’ll see many, many more professions get created. It’s just about where you get caught in the cycle.
It’s true. In ’95, you never would have thought, “If you just connect a bunch of computers together with a common protocol and make the web, you’re going to have Google and eBay and Etsy.”
Let’s talk about startups for a minute. You see a lot of proposals, and then you make investments, and then you help companies along. What would you say are the most common mistakes that you’re seeing startups make, and do you have general advice for portfolio companies?
Well, my portfolio companies get the advice in real time, but I think, especially for AI companies (to go back to how you opened this discussion, which was referencing a byline I had done for Gigaom), if a company truly does have artificial intelligence, show it. And it’s pretty easy to show. You show how your product leverages various learning techniques, you show who the people on your team are that are focusing on machine learning, but also how you, the founder, whether you are a technical founder or not, understand the underpinnings of AI and of machine learning. I think that’s critical.
So many companies, they’re calling themselves something-something-dot-AI and it’s very, very similar and analogous to what we saw with big data. If you remember, seven to ten years ago, every company was big data. Every company is now AI, because it’s the hot buzzword. So, rising above the noise while taking advantage of the wave is important, but meaningfully so because it’s valuable to your business, and because, from the get-go, you’re taking advantage of machine learning and AI not because it’s the buzzword of the day that you think might get you money. The matter of fact is for those of us who live and breathe AI and startups, we’ll cut through the noise fairly quickly, and pattern recognition and the number of deals we see in any given week is such that the true AI capabilities will stand out. That’s one piece.
I do think, also, that for the companies and founders that truly are leveraging neural net, truly are getting the software or hardware—whatever their product might be—to outperform; the dynamics within the companies have changed. Because we don’t just have the technology team consisting of the developers with the link to the product people; we now have this third leg, the machine learning or the data scientist people. So, how is the product roadmap being driven, is it the product people driving it, or is the machine learning talent coming up with models to help support it, or are they driving it, and product is turning it into a roadmap, and technology, the developers, are implementing it? It’s a whole new dichotomy among these various groups.
There’s a school of thought, in fact, that says, “Machine learning experts, who’s that? It’s the developers who will have machine learning expertise, they will be the same people.” I don’t share the view. I think developers will have some level of fluency in machine learning AI, but I think we will have distinct talent around it. So, getting the culture right amongst those groups makes a very, very big difference to the outcome. I think it’s still in the making, to be honest.
This may be an unanswerable question, because it’s too vague.
Lucky me.
I know.
Go ahead.
Two business plans come across your desk, and one of them is a company that says, “We have access to data that nobody else has, and we can use this data to learn how to do something really well,” and the other one says, “We have algorithms that are so awesome that they can do stuff that nobody else knows how to do.” Which of those do you pick up and read first?
Let’s merge them. Ideally, you’d like to have both the algorithms, or the neural nets, and the data. If you really force me to pick one, I’ll pick the data. I think there are enough tools out there and there is enough TensorFlows or whatnot out there in the market and in open source, that I think you could probably work with those and build on top of them. Data becomes the big differentiator.
I think of data, Byron, today as we used to think of patents back in the day. The role of patents is an interesting topic because, with execution, they’ve taken second or third seat as a barrier to entry. But, back ten, fifteen years ago, patents mattered a lot more. I think data can give you that kind of barrier to entry and even more so. So, I pick data. It is an answerable question; I’ll pick big data.
Actually, my very next question was the role of patents in this world. Because doesn’t the world change so quickly, plus you have to disclose so much. Would you advise people to keep them as trade secrets? Or, just, how do you think that companies who develop a technology should protect and utilize it?
I think your question depends a bit on what facet of technology are we talking about. In the life sciences, they still matter quite a bit, which is an area that I don’t know as much about, for sure. I think, in technology, their role has diminished, although still relevant. I cannot think of a company that became big and a market leader because they had patents. I think they are an important facet, but it is not the make-all or break-all in terms of must-have. In my view, they are a nice to have.
I think where one pauses, is if their immediate competitor has a healthy body of patents, then you think a bit more about that. As far as the tradeoff between patents and trade secrets, I think there is a moment in time when one files a patent, especially if secrecy matters. At the end of the day though, and this may be ironic given that we’re talking about artificial intelligence startups, much like any other facet of our lives, what matters is excellence of execution, and people. People can make or break you.
So, when you ask me about the various startups that I see, and talk about the business plans, I never think of them as “the business plan.” I always think of them in the context of, “Who are the founders? Who are the team members, the management team?” So, team first. Then, market timing for what they are going after, because you could have the right execution or the right product, but the wrong market timing. And then, of course, the question of what problem are they solving, and how are they taking advantage of AI. But, people matter. To come back to your question, patents are one more area that a startup can build defensibility but not the end-all and be-all by any stretch, and they have a diminished role, in fact.
How do you think startups have changed in the last five or ten years? Are they able to do more early? Or, are they demographically different—are they younger or older? How do you think the ecosystem evolves in a world where we have all these amazing platforms that you can access for free?
I think we’ve seen a shift. Earlier, you referenced the web, and with the emergence of the web, back in 1989, we saw digital and e-commerce and martech; and entire new markets get created. In that world—what I’ll call not just pure technology businesses, but tech-enabled businesses—we saw a shift both in younger demographics and startups founded by younger entrepreneurs, but also more diversity in terms of gender and background as well, in that not everybody needed to have a computer science degree or an engineering degree to be able to launch a tech or a tech-enabled company.
I think that became even more prevalent and emphasized in the more recent wave that we’re just on the completion side of with social-mobile. I mean, the apps, that universe and ecosystem, it’s two twenty-year-olds, right? It’s not the gray-headed three-time entrepreneur. So, we absolutely saw a demographic shift. In this AI paradigm, I think we’ll see a healthy mixture. We’ll see the researcher and the true machine learning expert who’s not quite twenty but not quite forty either, so, a bit more maturity. And then we’ll see the very young cofounder or the very experienced cofounder. I think we’ll see a mix of demographics and age groups, which is the best. Again, we’re in a business of diversity of thought and creativity. We’re looking for that person who’s taking advantage of the tools and innovation and what’s out there to reimagine the world and deliver a new experience or product.
I was thinking it’s a great time to be a university professor in these topics because, all of a sudden, they are finding themselves courted right and left because they have long-term deep knowledge in what everyone is trying to catch up on.
I would agree, but keep in mind that there is quite a bit of a chasm between teaching a topic and actually commercializing, in that regard. So I think the professors who are able to cross the chasm—not to sound too Geoffrey Moore-ish—are the ones, that, yes, they’re in the right field and in the right moment in time. Otherwise, their students, the talent that is knowledgeable enough, those PhDs that don’t go into academia, but are actually going into commercialization, execution, and implementation; that’s the talent that we’re in high demand for.
My last question is, kind of, how big can this be? If you’re a salesperson, and you have a bunch of leads, you can just use your gut, and pick one, and work that one, or you have data that informs you and makes you better. If you’re an HR person, you hire people more suited to the job than you would have before. If you’re a CEO, you make better decisions about something. If you’re a driver, you can get to the place quicker. I mean, when you add all of that up across an entire world of inefficiency… So, you kind of imagine this world where, on one end of the spectrum, we all just kind of stumble through life like drunken sailors on shore leave, randomly making decisions based on how we feel; and then you think of this other world where we have all of this data, and it’s all informed, and we make the best decisions all the time. Where do you think we are? Are we way over at the wandering around, and this is going to get us over to the other side? How big of an impact is this? Could artificial intelligence double GNP in the United States? How would you say how big can it be?
Fortunately, or unfortunately, I don’t know, but I don’t think we live in a binary world. I think, like everything else, it’s going to be a matter of shades. I think we’ve driven productivity and efficiency, historically, to entirely new levels, but I don’t think we have any more free time, because we find other ways to occupy ourselves even in our roles. We have mobile phones now, we have—from a legacy perspective—laptops, computers, and whatnot; yet, somehow, I don’t find myself vacationing on the beach. Quite the contrary, I’m more swamped than ever.
I think we have to be careful about—if I understood your question correctly—transplanting technology into, “Oh, it will take care of everything and we’ll just kind of float around a bit dumber, a bit freer, and whatnot.” I think we’ll find different ways to reshape societal norms, not in a bad way, but in a, “What constitutes work?” way, and possibly explore new areas that we didn’t think were possible before.
I think it’s not necessarily about gaining efficiency, but I think we will use that time, not in an unproductive or leisurely way, but to explore other markets, other facets of life that we may or may not have imagined. I’m sorry for giving you such a high-level answer, and not making it more concrete. I think productivity from technology has been something that’s been, as you well know, very hard to measure. We know, anecdotally, that it’s had an impact on measured activity, but there are entire groups of macroeconomists, who, not only can they not measure it, but they don’t believe it has improved productivity.
It will have a fundamental transformative impact, whether we’re able to measure it—I know you defined it as GNP, but I’m defining it from a productivity point of view—or not remains to be seen. Some would argue, that it’s not productive, but I would throw the thought out there, that traditional methodologies of measuring productivity do not account for technological impact. Maybe we need to look at how we’re defining productivity. I don’t know if I answered your question.
That’s good. The idea that technology hasn’t increased our standard of living, I don’t think… I live a much more leisurely life than my great grandparents, not because I work any harder than them, but because I have technology in my life, and because I use that technology to make me more productive. I know the stuff you’re referring to where it’s like, “We’ve got all these computers in the office and worker productivity doesn’t seem to just be shooting through the roof.” I don’t know. Let’s leave it there.
Actually, I do have a final question. You said you have a four-year-old daughter, are you optimistic overall about the world she’s going to grow up in with these technologies?
My gosh! We’re going into a shrink session.
No, I mean are you an optimist or a pessimist about the future?
Apparently, I’ve just learned—in the spirit of sharing information with you and all your listeners—that my age group falls into something called the Xennial where we are very cynical like Generation X, but also optimists like the Millennials. I’m not sure what to make of that. I would call it an interesting hybrid.
I am very optimistic about my daughter’s future, though. I think of it as, today’s twentysomethings are digital natives, and today’s ten-year-olds and later are mobile natives. My daughter is going to be an AI native, and what an amazing moment in time for her to be living in this world. The opportunities she will have and the world she will explore on this planet and beyond, I think, will be fascinating. I do hope that somewhere in the process, we manage to find a bit more peace, and not destroy each other. But, short of that, I think I’m quite optimistic about the future that lies ahead.
Alrighty, well let’s leave it at that. I want to thank you for an absolutely fascinating hour. We touched on so many things and I just thank you for taking the time.
My pleasure. Thanks again for having me.
Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here.

How Biology is Inspiring the Next Generation of Cybersecurity

Your average security operations center is a very busy place. Analysts sit in rows, staring intently at computer monitors. Cybersecurity alerts tick past onscreen—an average of 10,000 each day. Somehow, the analysts must decide, in seconds, which of these are false alarms, and which might be the next Target hack. Which should be ignored, and which should send them running to the phone to wake up the CIO in the middle of the night.
It’s a difficult job.
The alerts are false alarms the vast majority of the time. Cybersecurity tools have been notoriously bad at separating the signal from the noise. That’s no surprise, since the malware used by hackers is constantly mutating and evolving, just like a living thing. The static signatures that antivirus software uses to detect them are outdated almost as soon as they are released.
The problem is that this knowledge can cause a kind of numbness—and make tech teams slow to act when cybersecurity software does uncover a real threat (a problem that may have contributed to the Target debacle).
Luckily, a few government labs are experimenting with a new approach—one that starts with taking the “living” nature of malware a little more seriously. Meet the new generation of biology-inspired cybersecurity.
Sequencing Malware DNA
The big problem with signature-based threat detection is that even tiny mutations in malware can fool it. Hackers can repackage the same code again and again with only a few small tweaks to change its signature. The process can even be automated. This makes hacking computers cheap, fast, and easy—much more so than defending them.
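To make that fragility concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a "signature" is nothing more than a cryptographic hash of a file's bytes; real antivirus signatures vary and are often more elaborate. The `signature` helper and the sample byte strings are hypothetical stand-ins, not real malware.

```python
import hashlib


def signature(payload: bytes) -> str:
    """Hypothetical static signature: the SHA-256 digest of the raw bytes."""
    return hashlib.sha256(payload).hexdigest()


# Stand-in byte strings, not real malware.
original = b"open_file; connect_server; send_data"
tweaked = original + b" "  # the same code with one padding byte appended

# The defender's list of known-bad signatures.
known_bad = {signature(original)}

print(signature(original) in known_bad)  # True: the known sample is caught
print(signature(tweaked) in known_bad)   # False: the tweaked copy slips past
```

A single appended byte yields a completely different digest, which is why an attacker can automate repackaging far more cheaply than a defender can keep a signature list current.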
Margaret Lospinuso, a researcher at Johns Hopkins University’s Applied Physics Laboratory (JHUAPL), was pondering this problem a few years ago when she had a brainstorm. A computer scientist with a lifelong interest in biology, she was aware that programs for matching DNA sequences often had to ignore small discrepancies like this, too. What if she could create a kind of DNA for malware—and then train a computer to read it?
DNA maps out plans for complex proteins using only four letters. But CodeDNA uses a much longer alphabet to represent computer code. Each chunk of code is assigned a “letter” depending on its function; for example, a letter A might represent code that opens a certain type of file, while a letter B might represent code that opens a server connection. Once a suspicious computer program is translated into this type of “DNA,” Lospinuso’s software can then compare it to the DNA of known malware to see if there are similarities.
It’s a “lossy technique,” says Lospinuso—some of the detail gets scrubbed out in translation. However, that loss of detail makes it easier for CodeDNA to identify similarities between different samples of code, Lospinuso says. “Up close, a stealth bomber and a jumbo jet look pretty different. But in the distance, where details are indistinct, they both just look like planes.”
The resulting technique drastically cuts down on the time analysts need to sort and categorize data. According to one commercial cybersecurity analyst, the similarities CodeDNA found in two minutes would have saved him two weeks of hard work. But the biggest advantage of CodeDNA is that it won’t be fooled by small tweaks to existing code. Instead of simply repackaging old malware, hackers have to build new versions from scratch if they want to escape detection. That makes hacking vastly more time-consuming, expensive, and difficult, exactly how it should be.
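The idea of translating code into letters and then comparing sequences can be sketched in a few lines of Python. This is an illustration only, not CodeDNA's actual implementation: the `ALPHABET` mapping, the operation names, and the use of the standard library's `difflib.SequenceMatcher` as the comparison step are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import List

# Hypothetical alphabet: one letter per functional category of code.
ALPHABET = {
    "open_file": "A",
    "open_connection": "B",
    "read_data": "C",
    "encrypt": "D",
    "send_data": "E",
    "delete_file": "F",
}


def to_code_dna(operations: List[str]) -> str:
    """Translate functional chunks into a letter sequence.

    The translation is lossy on purpose: everything after the colon
    (file names, addresses) is discarded and only the category survives.
    """
    return "".join(ALPHABET[op.split(":")[0]] for op in operations)


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two letter sequences, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


known_malware = to_code_dna(
    ["open_file:/etc/passwd", "read_data", "open_connection:10.0.0.1",
     "encrypt", "send_data"])
variant = to_code_dna(
    ["open_file:/etc/shadow", "read_data", "open_connection:10.9.9.9",
     "encrypt", "send_data", "delete_file"])
benign = to_code_dna(["open_file:report.txt", "read_data", "delete_file"])

print(known_malware, variant, benign)
print(similarity(known_malware, variant))  # high: same behavior, new details
print(similarity(known_malware, benign))   # lower: different behavior
```

Because the file names and addresses are scrubbed out during translation, the tweaked variant scores as a close relative of the known sample, while a program that behaves differently does not.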
How to Build a Cyber-Protein
Lospinuso’s team built CodeDNA’s software from scratch, too; it’s different from standard DNA-matching software, even though they implement the same basic techniques. Not so with MLSTONES, a technology developed at Pacific Northwest National Laboratory (PNNL). MLSTONES is essentially a tricked-out version of pBLAST, a public-source software program for deciphering protein sequences. Proteins are constructed from combinations of 20 amino acids, giving their “alphabet” more complexity than DNA’s 4-letter one. “That’s ideal for modeling computer code,” said project lead Elena Peterson.
MLSTONES originally had nothing to do with cybersecurity. It started out as an attempt to speed up pBLAST itself using high-performance computing techniques. “Then we started to think: what if the thing we were analyzing wasn’t a protein, but something else?” Peterson said.
The MLSTONES team got a bit of encouragement early on when their algorithm successfully categorized a previously unknown virus that standard anti-virus software couldn’t identify. “When we presented [it] to US-CERT, the United States Computer Emergency Readiness Team, they confirmed it was a previously unidentified variant of a Trojan. They even let us name it,” Peterson said. “That was the tipping point for us to continue our research.”
Peterson says she is proud of how close MLSTONES remains to its bioinformatics roots. The final version of the program still uses the same database search algorithm that is at the heart of pBLAST, but strips out some chemistry and biology bias in the pBLAST software. “If the letter A means something in chemistry, it has to not mean that anymore,” Peterson says. This agnostic approach also makes MLSTONES extremely flexible, so it can be adapted to uses beyond just tracking malware. A version called LINEBACKER, for instance, applies similar techniques to identify abnormal patterns in network traffic, another key indicator of cyber threats.
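For readers curious what a protein-style database search over code might look like, here is a rough sketch in Python. It is not MLSTONES or pBLAST: the article does not say how MLSTONES turns code into letters, so the byte-folding `encode` function is an assumption, and the search shown is only the simplified "seed" stage of a BLAST-style lookup (counting shared short words), with no scoring matrix or extension step. All names and sample bytes are hypothetical.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def encode(data: bytes) -> str:
    """Hypothetical encoding: fold each byte onto one of the 20 letters."""
    return "".join(AMINO_ACIDS[b % 20] for b in data)


def kmers(seq: str, k: int = 3):
    """Yield every overlapping word of length k in the sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(database: dict, k: int = 3) -> dict:
    """Map each k-letter word to the names of the entries containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for word in kmers(seq, k):
            index[word].add(name)
    return index


def rank_hits(query: str, index: dict, k: int = 3) -> list:
    """Count shared seed words per database entry, best match first."""
    counts = defaultdict(int)
    for word in set(kmers(query, k)):
        for name in index.get(word, ()):
            counts[name] += 1
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Stand-in byte strings, not real samples.
database = {
    "trojan_family_x": encode(b"\x10\x22\x37\x41\x58\x63\x70\x85\x91\xa4"),
    "benign_tool": encode(b"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a"),
}
index = build_index(database)

# A variant with one byte changed in the middle still shares seed words.
variant = encode(b"\x10\x22\x37\x41\xff\x63\x70\x85\x91\xa4")
print(rank_hits(variant, index))
```

Nothing in the search cares what the letters stand for, which mirrors the agnostic approach Peterson describes: swap in a different encoding and the same lookup could be pointed at other kinds of data.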
A Solution to Mutant Malware
Cyberattacks are growing faster, cheaper, and more sophisticated. But all too often, the software that stops them isn’t. To secure our data and defend our networks, we need security solutions that adapt as fast as threats do, catching mutated malware that most current methods would miss. The biology-based approach of CodeDNA and MLSTONES isn’t just a step in the right direction here; it’s a huge leap. And with luck, they will soon be available to protect the networks we all rely upon.

With contribution by Nathalie Lagerfeld of Hippo Reads.

Eastwind leaves stealth to help companies respond to cyberattacks

Security tools are only useful if their warnings are heeded. Yet one of the culprits behind the infamous 2013 data breach at Target was the company’s decision to ignore its own alert system. The result: Tens of millions in fines, the compromise of 40 million shoppers’ credit card data, and the departure of its chief executive.
Eastwind Breach Detection is emerging from stealth with a software-as-a-service tool that does its best not to be ignored. The post-breach detection software will send alerts to anyone who’s supposed to receive them — incident responders, a company’s leadership team, the IT department — until the problem is addressed.
“The interesting thing about [breaches at Home Depot, OPM, and Target] is that the alerts fired just became part of the noise,” says chief executive Paul Kraus. “If our systems don’t see a change in behavior we’ll alert again. And we send out an insight report the day the breach is identified and then again every week after.”
Eastwind also offers context around the breach. Instead of holding information for a few days before trashing it, the company monitors its customers’ data for 200 days to offer an idea of what happened before, during, and after a breach. This data is then collected and shown in the weekly reports sent to its customers.
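To make the alerting cadence and retention window described above concrete, here is a minimal sketch in Python. It is not Eastwind's implementation: the function names, the event format and the idea of a single "resolved" timestamp are assumptions made for illustration. Only the 200-day window and the day-of-breach-then-weekly report schedule come from the description above.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=200)       # how long data is kept, per the article
REPORT_INTERVAL = timedelta(weeks=1)  # follow-up reports go out weekly


def report_times(identified_at, now, resolved_at=None):
    """Return when reports are due: the day the breach is identified,
    then every week until it is resolved (hypothetical model)."""
    end = now if resolved_at is None else min(now, resolved_at)
    times = []
    t = identified_at
    while t <= end:
        times.append(t)
        t += REPORT_INTERVAL
    return times


def retained(events, now):
    """Keep only (timestamp, description) events inside the retention window."""
    cutoff = now - RETENTION
    return [event for event in events if event[0] >= cutoff]


if __name__ == "__main__":
    found = datetime(2015, 1, 1)
    today = datetime(2015, 1, 20)
    for when in report_times(found, today):
        print("report due:", when.date())
```

The point of the sketch is that nothing stops the reports except resolution: as long as no `resolved_at` is recorded, another report falls due every week.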
It’s a bit like marrying your high school sweetheart: This person knows what happened before any problems occurred, watched them take place, and will presumably be around to make sure the issue is taken care of. (Trust me on this one.) Eastwind is meant to remain constantly vigilant, and its memory is long.
The company has other features that are supposed to differentiate it from its competitors, including a mobile application people might actually want to use; a service that can operate on Eastwind’s cloud or other tools like Amazon Web Services; and the ability to detect when an intruder has stolen any information.
But perhaps Eastwind’s greatest strength is that it was built so that anyone could use it. “I’ve had the opportunity to sit with [leaders of] Fortune 100 companies that have said, ‘I’ve taken the traditional security solution and given it to really smart guys to analyze,’” Kraus said. “It hurt me to think that a Fortune 100 company would have a monopoly on smart people, or that the problem was so complicated that only PhDs from Stanford or PhDs from MIT could solve it.”
Eastwind is Kraus’ response to that concern. Its mobile app is designed to make it easy for anyone to learn about the health of their company’s network. Its team was assembled to be the “really smart guys” behind a service that obviates the need for really smart guys. And the company’s reports are meant to do the thinking for users.
Altogether, this means Eastwind isn’t going to forget anything that might help it detect a breach, and it won’t stop warning its customers about the issue until it’s been resolved. Maybe these features will be enough to convince the companies responsible for millions of people’s private data to heed alerts about a threat.

For cyber security, machine learning offers hope beyond the hype

As businesses wind down for the holiday period, they’ll need to keep their cyber defenses up. While executives are tucking into their dinners, hackers will be trying to tuck into their businesses’ data. High profile breaches this year at organizations ranging from Anthem Healthcare to Ashley Madison and the US government’s Office of Personnel Management are a reminder of the threats that lurk online. And they raise the question of whether the cyber security industry can come up with a powerful new tool to frustrate the bad guys.
There’s been plenty of discussion at security conferences about the impact that machine learning will have on the cyber landscape. A subset of artificial intelligence, it involves the use of powerful algorithms that spot patterns and relationships from historical data, and get better over time at making predictions about brand new data sets based on this experience. Companies such as Amazon and Netflix use machine learning to help drive their recommendation engines, and banks and other financial institutions have long used it to tackle credit card fraud.
Now, we are starting to see some cyber security firms offering solutions that involve a machine-learning component. Huntsman Security, which counts intelligence agencies amongst its clients, recently announced what it claims is the security industry’s “first machine-based threat verification technology” that uses machine-learning algorithms to help analysts spot serious threats swiftly and take corrective action. Startups such as Cylance, Palerra and Darktrace are also employing machine-learning techniques in their services. [Disclosure: Wing Venture Capital is an investor in Palerra.]
It’s tempting to portray machine learning as a silver bullet that can be used not just to wipe out hackers, but also to wipe out jobs, by automating tasks performed by expensive personnel. This has provoked a backlash from some commentators, who have warned companies not to waste money on an unproven technology, and encouraged them to invest more in security teams and other tools instead.
However, that critique is based on a false claim about the technology’s potential — and a false dichotomy between human and machine.
Let’s take the issue of efficacy first. Machine-learning models work best when they can “train” on large volumes of data. Thanks to the rise of big data and extremely cheap storage, it’s now possible to feed vast amounts of information into models, which greatly improves their ability to detect suspicious activity. The goal is to distinguish anomalous behavior in things such as network traffic that might indicate a breach while minimizing false alerts (or “false positives” to use the industry’s terminology).
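A toy example shows the trade-off between catching anomalies and raising false alerts. The Python sketch below is an assumption-laden illustration, not any vendor's method: it learns a baseline from historical traffic volumes and flags new readings that sit too many standard deviations away from it. The function name, the sample numbers and the thresholds are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return indexes of observed values whose z-score against the
    baseline exceeds the threshold (a simple statistical stand-in
    for a trained model)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is anomalous.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical megabytes transferred per hour on a quiet network.
    history = [100, 102, 98, 101, 99, 100, 103, 97]
    today = [101, 99, 150, 104]
    print(find_anomalies(history, today, threshold=3.0))
    print(find_anomalies(history, today, threshold=1.5))
```

Lowering the threshold catches more, but it also starts flagging readings that are only mildly unusual, which is how false positives creep in. More and better training data gives a more reliable baseline, which is why volume matters.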
There are certainly challenges to be overcome. Algorithms are only as good as the quality and quantity of the data they are trained on, and data sets on the most sophisticated kinds of attacks mounted by nation-state actors (or their proxies) are still relatively thin. Sophisticated hackers can also try to fool models by employing tactics that seek to convince them that malicious activity is in fact legitimate.
In spite of such caveats, the machine-learning approach is still a great asset in a defensive arsenal. Given the volumes of data that security teams now have to deal with, adopting a more automated approach to querying network traffic and looking for anomalies that are not detected by traditional, signature-based systems makes sense. For instance, an analyst who has threat intelligence which suggests a network may be subject to a particular kind of data exfiltration attack could task a machine-learning model to look for telltale signs of this. Models can also provide analysts with other valuable insights, such as correlations between suspicious events.
To minimize false positives, many models rely not just on “unsupervised learning”, which involves crunching data to spot patterns themselves, but also on customer-driven, “supervised” learning. This can take the form of specific security policies, such as one that requires an alert to be issued if a bunch of sensitive files are suddenly sent to a new location. It can also involve analysts giving a digital thumbs-up or thumbs-down to alerts issued. Over time, this training can help a model to identify what really matters to an organization and reduce the risk of false alerts.
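The two kinds of customer input described above can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions: the policy limit of ten files, the field names and the `FeedbackScorer` class are all hypothetical, and the vote counting is a stand-in for supervised learning rather than a trained model.

```python
from collections import defaultdict


def policy_alert(transfer, known_destinations, max_sensitive_files=10):
    """Hypothetical policy: alert when a batch of sensitive files
    goes to a destination that has not been seen before."""
    return (transfer["sensitive_files"] > max_sensitive_files
            and transfer["destination"] not in known_destinations)


class FeedbackScorer:
    """Counts analyst thumbs-up/thumbs-down per alert type and mutes
    types that analysts have mostly rejected."""

    def __init__(self, min_votes=5, min_useful_ratio=0.2):
        self.up = defaultdict(int)
        self.down = defaultdict(int)
        self.min_votes = min_votes
        self.min_useful_ratio = min_useful_ratio

    def record(self, alert_type, useful):
        if useful:
            self.up[alert_type] += 1
        else:
            self.down[alert_type] += 1

    def should_alert(self, alert_type):
        total = self.up[alert_type] + self.down[alert_type]
        if total < self.min_votes:
            return True  # not enough feedback yet, so keep alerting
        return self.up[alert_type] / total >= self.min_useful_ratio
```

In this sketch, an alert type that analysts reject five times in a row stops firing, while one with little feedback keeps coming through. A production system would weigh far more signals, but the loop is the same: human judgment feeds back into what the model surfaces next.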
Will human trainers ultimately be displaced by the “machines” they teach? Some companies may use machine learning as an excuse to downsize, but I think they’ll be the exception rather than the rule. When I speak to chief information security officers, I often hear that they are concerned about a shortage of skilled cyber personnel. By putting machine-learning models to work in support of existing staff, security leaders can boost productivity and free up their teams to work on the most pressing and strategic issues.
There is another consideration that might resonate at this time of year. Algorithms don’t need to take a holiday, so they can keep on working while some of their human masters are taking a well-deserved break!
Martin Giles is a partner at Wing Venture Capital (@Wing_VC). He was previously a journalist with The Economist.

Where is enterprise infrastructure headed in 2015?

The enterprise industry is another year older … and hopefully somewhat wiser. Here’s what enterprise watchers should expect to see in 2015.

More cyber attacks

Sadly, this is an easy one. As bad as 2014 has been, and it has been bad, we’re just seeing the tip of the iceberg. Given the steady increase in value going through our systems (credit card numbers, personal information, IP), organized crime and nation-sponsored attacks will continue to rise in quantity and sophistication. The current approaches to security clearly aren’t cutting it, which is why the security space is one of my biggest personal focus areas. (Full disclosure: I’m an investor in Illumio, Menlo Security and ThreatStream, all companies in this space.)

AWS pushes further into the enterprise

Almost every startup that we fund today is using Amazon Web Services, but it’s interesting to see AWS creep further into the medium and larger companies that dominate IT spending. At this year’s AWS Re:Invent, there were lots of compelling enterprise anecdotes, plenty of “all-in” stories, and, most importantly, the arrival of an ecosystem. There were startups and large companies alike announcing integrations with AWS, with particular focus on the “-ities”: predictability, manageability, security, and availability. These are good signs of increased adoption in mainstream businesses where it’s now not “if” but “when” a company adopts cloud.

The rise of IaaS competitors

AWS continues with its strong lead, but 2014 also showed that there’s going to be a bigger fight than ever. In particular, Google Compute Engine and Microsoft Azure are rapidly improving their services and have the pocketbooks to fight this for the long term. Throw in Rackspace, IBM, vCloud Air, HP and the many other regional or vertically oriented offerings, and it’s going to be a major battle, with customers as the likely winners.

Containers get down to work

The rise of Docker has been one of the true success stories of 2014. However, it has also created a deluge of competitors (CoreOS, Red Hat, Ubuntu) and interesting co-opters (Amazon, Google and VMware). Quite a year for a previously unheralded technology. The rise is real, but I believe that some of the hype will subside in 2015 as some of the real work of making containers usable by enterprises begins in earnest.

Converged/hyper-converged infrastructure grabs the limelight

2014 saw continued excitement over “converged infrastructure,” pre-configured hardware/software bundles that are powerful and easy to adopt. Nutanix, VCE and Cisco UCS get most of the attention, but there’s lots of interesting competition on the way, especially as the latter two vendors update their relationship status to “It’s complicated.” The latest offerings range from VMware’s EVO designs to new products from the big system vendors (Dell and HP are particularly aggressive). And I can personally attest to a slew of startups heading into this converged world with a variety of technologies and approaches.

APIs on the mind

I wrote about “mobile-first infrastructure” earlier this year and continue to think it will drive several longer-term infrastructure changes. In 2015, I think it will manifest itself most as the rise of APIs in enterprise development, as companies both produce and consume APIs like never before. Look for increased conversations, companies and challenges arising over this shift. (Full disclosure: I’m a backer of RunScope, which makes developer tools for this “API economy.”)

Network virtualization gets its legs

There has been much discussion of the arrival of software-defined networking (SDN). However, the term itself has been polluted to the point where it means something different to almost everyone you ask. I prefer the term network virtualization to speak more holistically about the advancement in separating out the logical network from the physical network. Cisco ACI and VMware NSX appear to have the lead, and 2014 saw real movement from proofs of concept toward significant paid usage. Anecdotally, most of the adoption is in service providers, financial services and tech-heavy IT companies. 2015 should see further progress in the adoption, including by a broader set of consumers.

Big deals for big data

In 2014 there was nonstop talk about big data, analytics, and the opportunities and challenges of each. 2014 funding for companies has been unprecedented, ranging from Intel’s huge bet on Cloudera to substantial private investments in DataStax (driver of Cassandra), Databricks (driver of Spark), Platfora, AltiScale, DataGravity and numerous others. (My company, General Catalyst, invested in AltiScale and DataGravity.) Next year these companies will all focus on revenue, and we’ll see how the public markets respond to at least one Hadoop vendor, as Hortonworks is now a public company.

That’s one person’s cut at developments in enterprise infrastructure for 2015, and I’m sure I’ve omitted others that will be even bigger. That’s what is so fun about this space these days: We’re in a modern-day renaissance driven by the convergence of new technologies, new expectations and new challenges, all of which point toward more and bigger changes happening each year than may have taken place in prior decades. Here’s to the fun ride ahead.

Dr. Steve Herrod is a managing director at General Catalyst Partners and was CTO and senior vice president of R&D at VMware.

Note: This story was updated at 5:24 p.m. PST to correct the reference to Cisco ACI (application-centric infrastructure) not ACE.