People will give up their personal info if you give them a good reason

Session Name: Addressing The Tension Between Personalization And Privacy.
S1 Announcer S2 Phil Hendrix S3 Ken Chahine S4 Naveen Jain S5 David Shim S6 Audience Member 1 S7 Audience Member 2
This is going to be moderated by Phil Hendrix; he’s the director at IMMR and an analyst with GigaOM Research. The speakers are Ken Chahine, SVP and GM of DNA at Ancestry; Naveen Jain, founder and CEO of Inome; and David Shim, founder and CEO of Placed. Please welcome our next panel.
Good afternoon everyone. This is the hearty crowd that’s hanging in there until the very end of the afternoon. Actually, there’s a session after this one as well. I think some of the workshops are still going on. We’ll go ahead and get started and I’m sure others will straggle in here. I’m Phil Hendrix with IMMR and also a GigaOM Pro analyst. Delighted to be moderating a panel on personalization and privacy this afternoon. Important topic, we’ve had good discussions throughout the day, I’ve really enjoyed the program and I hope you have as well. We’re going to be talking about some technology, but also a lot of applications as well as some of the issues that arise around uses of big data. So, I’m going to ask my panelists to quickly introduce themselves and then we’ll jump right into things. Ken.
Thank you for the invitation. My name is Ken Chahine, I’m SVP and GM of Ancestry DNA, it’s a new service that Ancestry launched in May of last year where we analyze a lot of DNA data and give people a lot of information about their ethnicity, who they’re related to and hopefully we’ll have an opportunity to share a little bit of the big data challenges that we have.
My name is Naveen Jain. I’m the founder and CEO of Inome. In addition to that, I do all kinds of fun stuff, including a lunar mining operation called Moon Express and other things within the enterprise. The idea really is, if you can take all the information in the world and make it person centric and read a person’s information gene, information that’s as unique as your genes attached to you, what would happen? Every industry would potentially get disrupted and that’s our job.
Very good, thanks. Dave?
David Shim, founder and CEO of Placed, and we contextualize locations, so we give you insights in terms of what is the most popular business in Seattle for men versus women and we break that down to give you insights in terms of web analytics, but for the physical world.
So, I want to start out with an anecdote. Years ago, I had a colleague who got his dream job. He was a native of Maine and he got invited to come back and work with L.L.Bean, and one of the first initiatives he worked on was implementing caller ID in their call center. So, he was relating the story to me, I was with a call firm at the time, and he said, “Phil, we had this woman from the south who called and we recognized her based on caller ID and we said, ‘Hi, Mrs. Smith, how are you doing?’, and she said, ‘Oh, you recognized my voice?’” [chuckles] thinking perhaps not that many southerners ordered from L.L.Bean. This is a pretty sophisticated crowd, so it’s probably going to take a lot to impress them, but as you think about what we can do, even today, with big data and all of the tools that are available to us, what would impress this crowd in terms of our ability to identify, personalize, individualize, whatever terms you wish to use? Who wants to go first on that?
I’ll take it. So, if you want to really impress, think about all the information that exists about you, and all that information may exist in various different sources with no common key. So there may be an article written about you, an email or a place or, in fact, even a physical mail or a magazine delivered to you. The house you may have bought, or the crime you may have committed. All of those things are all over the map. Can you put them together and assign them to you as an individual, not just a name but to you as a person? Imagine if you can do that with 20 or 30 or 100 billion records. And we’re talking just the US alone, right. So as you start to see, by big data I mean big, big data. At the end of the day, big data really wants to become small, because the human mind can really only look at small data. So our job is to take really, really big data and make it so small that we can comprehend it as human beings.
Alright, interesting example. Others?
I would say, when you look at your cell phone, it’s a persistent cookie. Think about everything that’s possible today in terms of the web, to understand what websites you go to, building personas, just based on – not knowing who you are – just based on the sites that you consume, your internet connection speed, the browser type that you have. Now take the phone instead and it has that much data, plus more. It actually knows what places you’ve gone in the physical world, when your next meeting is, where you’ve checked into. All those data points are available and I think what we’re going to see in the future, down to the individual level but also in the aggregate, is being able to quantify that. We just had one study where we looked at showrooming and we saw that females were more likely to go to Kohl’s, men more likely to go to Best Buy. That’s something that just wasn’t available until the smartphone came and you could actually measure that data.
Bit stream data, in essence.
I would say that if you step back and think about it for a minute, we carry around ultimate big data with us every day. DNA carries three billion pieces of information that made each one of us. When you think about how little we know about the DNA that we’re carrying around with us, we’re still at the very early stage of trying to identify it. At Ancestry, we’re using that to figure out first and foremost, how you got here. You’ve got that DNA from your parents, your grandparents, your great-grandparents, and we’re trying to understand and piece all that information together to put sort of a map of how you actually got here.
And big tables. You gave me the dimensions before we came in. They are literally how many columns and…?
One of the things we do at Ancestry is we try to find potential relatives that you have. So you have on average, let’s say, eight first cousins, but you may be surprised to know that everyone here has about four to five thousand fourth cousins, right? What we do is we compare every individual that goes into our database against every other individual that’s taken the test. That problem is an n-squared problem that gets bigger and bigger and bigger. When the database gets to a million, we are going to have to find a different solution, because it’s getting pretty ugly.
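To make the n-squared growth concrete, here is a minimal sketch. It is not Ancestry's actual pipeline; it only counts how many unordered pairs an all-against-all comparison produces, and the function name `pairwise_comparisons` is hypothetical.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n samples: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Illustrative database sizes (assumed, not figures from the panel).
for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9,} samples -> {pairwise_comparisons(n):,} comparisons")
```

At a million samples the count is roughly half a trillion pairs, which is why a brute-force comparison stops being practical at that scale.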
If we think about the state of personalization at present – Geoffrey Moore had an interesting article within the last couple of months and basically the title of it was “Personalization is Balderdash”. Now, this probably has a certain meaning to each of you. In essence, he was saying “Don’t believe it.” He argued that at present, personalization is being used to improve the yield, by and large, on advertising from 99% wrong to 97% wrong. If you were to think about where we are and where we might get to in three to five years, give us an example of something that you think will be remarkable in terms of its impact on individuals’ lives or companies’ ability to really radically enhance or change the experience for the individuals they serve. So something that’s significant beyond advertising.
First of all, think about who we are as individuals, and then think about what’s happening in the pharmaceutical industry. Every time they do a drug trial and the drug gets approved, it might have a 20% or 30% efficacy rate. Imagine that 20% or 30% efficacy rate: that means 70% of the time it does absolutely nothing. If you can somehow now take that and individualize or personalize it, to say it works 100% of the time for these people and 0% of the time for these people, imagine what happens. And there are three very interesting things starting to happen as you start to sequence not just the genome, but the epigenome and even the microbiome. In our body there is ten times more foreign DNA than our own DNA. That means that one tenth is our own DNA and nine tenths is actually the microbes and the bacteria that live inside our body, and that is actually how we eat and digest and everything else that happens to us. So if you can start to sequence the genome, the epigenome, the microbiome, the proteome, it becomes a big data problem. In terms of drug testing, there’ll be digital models of that. You’ll be able to see, if I’m suppressing this particular genetic sequence through the epigenome, what the impact of that is on the heart and the whole system. In fact, there are some very, very smart people, even inside IBM, who are working on a whole digital model of the heart, so that as you modify one thing you can see the impact on the whole system.
That’s huge. That’s going to be hard to top. David? [laughter]. Anything that comes close to individualized medication that has 100% efficacy?
I’m not going to beat that, but I think what it’s going to come down to in a few years is that the view on privacy will change. Right now, everyone’s afraid of location, everyone’s afraid of DNA, everyone’s afraid of all these different things. I think as people start to understand that they can get a benefit from sharing this type of data, it will become more open. Just like cookies used to be the end of the world on the internet – “I don’t want cookies, I’m going to delete this.” Now people are like, “They’re going to track me, they’re going to show me stuff on the internet, and there’s a value exchange.” I think we’re going to go down that same path in privacy where I can go in and I don’t mind sharing my data if I get something back in return.
Just as one anecdote. Imagine if you had seven days of the websites that you visited and then seven days of where you were in the physical world. Which one would you want to give up? People would typically immediately say location but then you look through the sites that you visited, and you might actually change your mind. I think that’s where the perception is going to change over time.
Big impact.
To me– I want to piggyback off what Dave just said. I think a lot of it is value exchange, and if you think about what Ancestry does, are people concerned about giving us their DNA to analyze it? Absolutely not, right. But our product is personalized. They give us their DNA so we can give them their results. I think you’ll all be pleased to know that when you give us your DNA sample we don’t give you a generic result. We actually analyze your personalized data and give you your information. I think David’s right that the value exchange is a very, very important part of it and as long as we keep that value exchange there, I think we will strike the right balance between privacy and that value.
I often do focus groups with members of my family, at least before doing more formal studies, but I read with interest about the one billion dollar initiative that Disney is undertaking, MyMagic+, in which they’re using RFID bracelets and then using that information to personalize the experience. For example, you walk up to a character and the character can greet you and your child by name, and ask how your experience was in the Haunted Mansion and so on. I’m testing this with my wife, and she says, “No way.” You should also know that whenever my wife is asked “Would you like to share your location with this app?” she says “No,” so maybe she’s an extreme example. But study after study has shown that a fairly small percentage of individuals are willing to share various types of data with a variety of companies.
In fact, a study that we did that was sponsored by GigaOM Pro showed that people were more willing to share their location data with a restaurant than they were with their bank. So, a question for you: what must companies do, David to your point, to gain customers’ cooperation in order to share more data that enables some of the things we’ve just described here? David, you’re working with location data, what are you doing?
We measure in excess of 70 million locations on a daily basis today. On a per user basis, we see about a thousand locations. The way we actually get that data is we have a triple opt-in, where someone installs an app, they say “Yes, I want to give it location permissions,” and then we put in explicit language, “Yes, this is okay to measure me,” and then we have Terms of Service. What we do is we actually give them points that they can redeem for gift cards, for prizes, or even to make donations to charity.
Is it a generational thing, where more younger consumers are opting in? Or do you see older consumers opting in as well?
I think you have to find the right fit. So it’s not going to work globally. The sweepstakes, the gift card, works for one group. We have another group where we make a donation to charity in exchange for that data, and that attracts a completely different group. I think in your example, your wife has to have the right kind of carrot, to say “Hey, this is worthwhile.” It’s up to us as businesses to figure out what that carrot is.
So what works, in terms of getting cooperation?
In this particular case about your wife, I think she will probably care about the security of children. So if somebody were to say, as soon as they reach school it will automatically send me the location, to tell me the child has reached school, or if they were straying from the school by more than 100 meters, it will alert you to let you know that they have actually strayed away. The point is very simple. While she cares about privacy, at that time the privacy’s out the window when it comes to security of the children. So I think there’s a very good value proposition there if somebody was to say, “Give me the location of your child and I will alert you if the child is not at school by this time.”
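The 100-meter alert described above amounts to a simple geofence check. The following is a minimal sketch under stated assumptions: positions are (latitude, longitude) pairs in degrees, distance is the great-circle (haversine) distance on a spherical Earth, and the names `haversine_m` and `has_strayed` are hypothetical rather than taken from any product mentioned on the panel.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def has_strayed(child: tuple, school: tuple, radius_m: float = 100.0) -> bool:
    """True when the child's position is more than radius_m from the school."""
    return haversine_m(child[0], child[1], school[0], school[1]) > radius_m


# Made-up coordinates for illustration only.
school = (47.6062, -122.3321)
print(has_strayed((47.6067, -122.3321), school))  # roughly 55 m away
print(has_strayed((47.6082, -122.3321), school))  # roughly 220 m away
```

A real service would also have to handle GPS error and stale readings, which this sketch ignores.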
So, compelling value propositions. Okay. Ken?
So, we’ve taken an approach where we allow customers to opt into giving us the data, and they can always withdraw their DNA sample and their data from us. So it’s completely transparent, it’s very open, and it’s very clear what they can do. I think the one thing we do differently, a right that not everyone exercises but an option everyone loves to have, is that we give them their data back. So, when you take a DNA test, if you want to download your raw data and use it for other purposes, you absolutely can. I would say that very few people actually express an interest in that, but I think we would all feel a little bit better if we knew that the data companies were collecting was something you could download, so you could understand what was collected and how it was used.
So value and protection–
Ken, I’m going to jump in just to make it entertaining. The point is, most of the time people don’t understand what it can be used for. In some sense, most people looking at something that technical say, “My DNA, why would anybody care about it?”, until somebody says, “Now, you can be part of drug testing,” because now your DNA can be used to see if there’s anything in you or not. Are they specifically giving permission for that? God knows what use that DNA will be put to. So, in some sense they’re giving permission, but they really don’t know what they’re giving permission for.
But Naveen, your company is also using a wide variety of data–
Of course.
Including social. I came across a company that I thought was interesting, so let me test this one with you. They allow companies to detect tweets that meet certain parameters, and those companies then respond to the individuals who are posting those tweets. I was curious and, again, tested this with my daughter: how would you feel if a restaurant, or an insurance agent, or whoever responded to your tweet? So there’s this question of how information that one posts or shares is often used in unintended ways. What are the boundaries, and how do companies accommodate the unintended uses in ways that avoid the reaction, “Oh my God”? Any thoughts on that one?
Again, to us, we recognize that the business has been going extremely well, and I think the transparency that we give in terms of the data is a real key aspect of what we do. I would say that our tendency is to become more and more transparent rather than less. If you want to go to the example on medicine, there’s an enormous social good that’s going to come out of this. I think what people need to understand is that you may not necessarily benefit specifically from something, but in 10 or 20 years one of our relatives is going to get a text and it’s going to say, “You’re on this drug and you shouldn’t be on this drug, get off of it immediately,” and the only reason we’ll be able to get that data is because we analyzed a hundred million people and were able to figure out who is going to have a side effect. So, again, communication and transparency are the key to it.
I think just to add to it, I think it doesn’t matter what industry you are in, you have to always look that your competition and what you’re doing is not going to be disrupted by people in the same industry. It’s going to come from completely outside that industry. So, the example we’ve been giving is how people in this room, big data people are going to impact the drug industry and there’s not a single person here from the pharmaceutical industry sitting here thinking about it. That just tells you that. What are the kind of things like manufacturing, it may be completely disrupted by 3D printing. So, when you’re able to even take your own gene and get printed on the other end, you literally have been transported from one place to another place over e-mail. What if you could send your sperm over e-mail to somebody else and print the sperm on the other end? So, imagine the consequences of that. It suddenly changes the game. So, I think the whole system suddenly changes what is going on in a 3D printing world where you are able to print it on demand right in your own home.
So, what are the technologies, and this is a technology crowd, so let’s speak to the data platforms and tools that are enabling and will enable this vision we just described. Give us a sense of what kinds of tools you’re using and anticipate using over the next 12 to 18 months that will enable us to accomplish some of these things you’ve been describing. Big data, new data, tools, algorithms, give us a sense.
Well, the thing is, big data has become such a buzzword that everybody has a different idea of what big data really means. So, the point is, yes, it is a large amount of data, but a large amount of data really does absolutely nothing unless you can make it comprehensible, start to integrate the sources together, and then analyze them together. So, I think it is about integration, it is about accuracy, it is about the difference between precision and recall. It is a difference in terms of understanding a prediction.
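The distinction between precision and recall mentioned above can be illustrated with a small sketch. It assumes a record-matching setting in which a system predicts which pairs of records belong to the same person; the record IDs and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(predicted: set, actual: set) -> tuple[float, float]:
    """Precision: share of predicted matches that are correct.
    Recall: share of true matches that were found."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical record pairs the system claims refer to the same person.
predicted = {("r1", "r2"), ("r3", "r4"), ("r5", "r6"), ("r7", "r8")}
# Hypothetical ground truth: pairs that really are the same person.
actual = {("r1", "r2"), ("r3", "r4"), ("r9", "r10")}

p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four predicted matches are right (precision 0.5) and two of the three true matches were found (recall about 0.67); tightening the matching rules usually raises the first at the expense of the second.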
Do we have all the tools today that are needed?
Of course not.
So, what’s needed?
Well, all of those. The point is that there’s no way to know when you have stacks of data that belong to the same person. So, there are a lot of tools that are eventually going to be needed in terms of analytics, the tools that are going to be needed for prediction.
So, one of the companies that I’ve worked with in the past is LexisNexis, and among the things their platform is designed to do is to disambiguate, I learned that term a couple of years ago, and really figure out, is this John Smith that John Smith? So, this is one of the issues that you’re addressing. Are there other issues that are thorny and difficult, that you hope someone in this audience or beyond will solve, and that will enable you to do even more and better things with the kinds of data that you’re working with?
Going back to ancestry, how do you go back and how–
Yeah. The big thing for us is that you have the 3 billion nucleotides in every cell and we really don’t have a dictionary; we have no idea what each one of those actually says. So, a big part of Ancestry is, yes, the genealogy, but a lot of it is to take that data, which is somewhat unstructured, and create a dictionary that says when you see this, this is what it means. For us, and that’s why I joined Ancestry, it’s the billions of records and the hundreds of millions of other nodes in people and trees and things that we have. We’re using both of those data sets to make sense of the DNA and that’s going to be extremely useful, not just in DNA but in other areas as well.
Okay. David?
I think the big thing for us is just the handset manufacturers out there, to actually get all the data points that are available. Big data makes it easy and cheap to store the data, to process the data, but it’s also up to the handset manufacturers to give us different data points. So imagine with Wi-Fi: the intended purpose of Wi-Fi was never to actually triangulate where you are in the physical world. But all of a sudden all these sensor values started to come in and hey, we can actually get longitude and latitude when we use a Wi-Fi signal, and all of a sudden it’s a core part of Google Maps, it’s a core part of when you’re inside of a mall and you’re looking around and you look at a map to see where you are. That’s the technology that is there and it wasn’t the intended purpose, but for us, it’s around getting the value out of this and getting more sensors.
Will NFC create new opportunities for a company like yours? Or new challenges, or both?
TBD. I don’t think NFC is there yet. I think right now it looks like QR codes. At the end of the day, there’s not enough of it and it’s not the standard. So, we’re interested to see what happens, but we’re not necessarily betting on it.
So, who does a good job in the commercial arena today in terms of personalization? And I’ll take 2 off the map here. Netflix and Amazon, beyond those two, who in your opinion is doing a great job of taking these data and somehow using it in ways that create personal experiences or value in other ways? Can you give us any examples of companies in addition to your own?
Sure. We all saw the example of Target. They were able to in fact predict, based on the buying behavior, that the girl was pregnant, even before the father knew the girl was pregnant, because she was buying un-fragranced lotion. So, they were able to start sending the promotional coupons for the baby, and the father was very angry about that. So, the point is, using the buying behavior, they were able to predict that this woman is pregnant and she’s going to have a baby.
Okay. All right. Target. Advanced retailer there. Give me a couple more. And what’s your sense in terms of the average company’s ability to take these data and use it in ways that are constructive for the consumers and the companies themselves?
I was just thinking of the example we were talking about in the ready room. You were talking about the ability to take someone’s gait and be able to use the way they run or walk to identify them. I can easily envision seeing a video of someone finishing a race or triathlon and sending them the right shoe with the right correction. So, there’s really almost no end. I think this is really just the beginning. We say that a lot, but it truly is, and it’s really a matter of finding data, refining it and coming up with predictive algorithms that are actually accurate, and that just takes a lot of time.
So, Nike is probably a great example of a company who has access to that data and I don’t know if anyone from Nike is here and I’m curious if they’re using that in ways beyond descriptive and feedback and so on. David, any other examples come to mind?
I think Quantified. You’ve got a FuelBand on there. I think this is just the starting point, where as you start to get more of those data points, you’re going to start to get customization. Wouldn’t you like to get the perfect shoe based on your walking habits or your running habits? Being able to get that type of information, or “Hey, you should run here because this is the trail that people who do the same pace like to go at”, so I think that’s going to be an interesting opportunity.
The one thing I would say as a caution, and we’re also starting to see this with the DNA, is that you also have to be careful. With more data there’s more opportunity to make mistakes. So, I would say that one of the things that’s really important is that as we go down this path we create what we call truth sets, to make sure that as you progress and analyze the data you’re not going down the wrong path. We’ve spent a long time at Ancestry initially trying to find out, if we predicted two people are relatives, are they in fact relatives or not? The way to really do that was to take vetted pedigrees and confirm yes, fourth cousin in DNA, fourth cousin in pedigree. So, that rigor is going to be something that’s important, because you may really go down the wrong path only to find out much later that it’s not really personalized, it’s wrong.
So, Ken, is 6 degrees of separation, does that actually exist?
That’s another panel, but I’ll tell you, we are a lot more related than everyone thinks. So, take a test and you’ll find out.
Another thing I was going to say on big data: one of the biggest problems we see is that a lot of people get confused between correlation and causation. That’s one of the biggest things we all have to be extremely careful about when analyzing big data, to not get confused.
So, if you have a question, please come up to the mic. I’ve got one more big question and that is, in some categories like mobile commerce and mobile payments, I actually had a client last week in a category who said, “We’re really not that concerned with what Google may do”, and I challenged them on it a bit, because my sense is that Google is looking at all of your data and perhaps, given their tools and capabilities, is probably in the best position to do some of the things that we’ve talked about today across categories. So, if you’re not worried about Google, you should be. So, I’m curious, as you think about the average company and their ability to do these things, and the potential that they may be disrupted by a competitor who’s demonstrated these capabilities, what’s your sense of the threat versus opportunity in categories like financial services, payments, perhaps medicine, health care and the like? Do you see the incumbents seizing the opportunity, or disrupters basically disintermediating?
In every industry you see both happen. Experts are very good at incremental evolution, but most disruption comes from non-experts. Once you become an expert in a field, you can essentially come up with incremental evolution, but disruptive ideas are very hard to come by, for the same reasons that you’re good at it. So, just because a large company has lots of assets, it is not necessarily able to find out how to use those assets for a disruptive idea. History is full of examples. You can go look at IBM. IBM is one of the few companies that have actually survived.
So, more of a threat based on experience.
David, what’s your sense?
I think to your point, there are a lot of historical examples where the incumbent doesn’t do very well, there’s Google Plus, Microsoft, etc., you can go down that list, so it’s not really a concern. Our focus is just to go in and say, “This is a problem that no one else is solving today, we think we can do it the best, so let’s go at it”, and we’re not worried about the competition at the end of the day.
Okay. Ken?
I would agree. I think we’re going to find new uses for all technologies in ways that the incumbents are not going to see and I think in some cases they’re going to be extremely valuable and very disruptive. So, I would bet that the pattern is going to continue.
Okay. Do you have a question? Yes?
So, my question is why are personalization and privacy always positioned as being in opposition?
Good question.
Because to me, it seems one of the things that’s really missing is the idea of giving end users the tools to personalize their own experience and their own search results or their own whatever, and that’s not done. I was out presenting at Yahoo and I was talking to Tim Parcy out there and they’re all about personalization, and I said, “Why wouldn’t you just give users tools to personalize search when they feel like it?”
So, personalization can be personalized. Good point.
It’s a radical thought.
That goes with my point. We give the end user the ability to download their raw data so they can analyze it in ways that, quite frankly, even Ancestry isn’t interested in. There may be a certain DNA sequence that runs in your family that you’re interested in specifically that maybe isn’t interesting to Ancestry. So, we completely agree with giving those tools back and letting people crowdsource their own solutions.
Other questions? Yeah? Speak up. You might go to the mic real quick. We’ve got 10 seconds, but run to the mic. Make your question short and we’ll need a quick response as well.
Personalization and privacy are separated by a thin line. With this invention of big data as a solution, is the industry prepared for signing up for some kind of regulation that invades the privacy of people? It may be for selling of data or acquiring some information.
Are you optimistic that industries can self-regulate and avoid issues that might disrupt this whole phenomenon? Or are we expecting regulators to step in and manage that?
That’s the same thing that happened to the financial service industry, it’s going to happen to this industry too.
Quick answer?
Government is the problem, not the solution.
We’ve started to regulate, if you will, through contractual T’s and C’s what people can do with data and things like that. So, we’re not waiting for the government, we’re contractually dealing with the customer and trying to protect them that way.
Now, we’re moving forward with a very aggressive plan in terms of getting opt-ins and making sure that we’re above and beyond whatever else is going to come down the pipeline.
All right, we’re out of time. Please join me in thanking the panelists.