10 ways big data changes everything

Can cell phone data cure society’s ills?

By Kevin Fitchard
What can cell phone bill data really tell you? Apart from explaining how much you owe your cellular provider, it harbors contextual information, which most of the time isn’t very useful to anyone but you. But a project developed by Nathan Eagle, a professor at Harvard’s School of Public Health and MIT Media Lab, is compiling millions of phone records to help development groups make important decisions like finding the best region in Kenya to launch a malaria eradication campaign or enabling health care providers to discover abnormal patterns in cholera outbreaks in Rwanda.
“Every time you receive a phone call or a text message, these events are all represented as rows of data in a database,” said Eagle. “That data is collected for the purpose of billing, but there is a profound ability to use this data for a lot of social good.” Eagle estimates there are over 5 petabytes of data generated every day by mobile phone subscribers around the world, though his group only has access to a small but growing fraction of it.
Eagle is working with carriers, governments, developmental organizations, and public health and welfare agencies around the world to mine cell phone billing records for meaningful data that can be used to solve some of the world’s biggest social problems. When someone makes a call, the operator records where the call was placed (via the cell tower ID), who was called and how long the call lasted.
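As a rough illustration of what one such row might hold, here is a minimal Python sketch. The field names, the sample values and the salted-hash step are assumptions made for illustration; the article does not describe the carriers’ actual schema or how identifiers are anonymized.

```python
from dataclasses import dataclass
from datetime import datetime
import hashlib


@dataclass(frozen=True)
class CallRecord:
    """One hypothetical billing row; field names are illustrative only."""
    caller: str        # pseudonymous ID of the subscriber placing the call
    callee: str        # pseudonymous ID of the party called
    tower_id: str      # cell tower that handled the call (a proxy for location)
    start: datetime    # when the call began
    duration_s: int    # call length in seconds


def pseudonymize(phone_number: str, salt: str) -> str:
    """Replace a phone number with a salted SHA-256 digest (an assumed scheme)."""
    digest = hashlib.sha256((salt + phone_number).encode("utf-8")).hexdigest()
    return digest[:16]


# Made-up sample values, not real subscriber data.
record = CallRecord(
    caller=pseudonymize("+254700000001", salt="demo"),
    callee=pseudonymize("+254700000002", salt="demo"),
    tower_id="tower-17",
    start=datetime(2012, 3, 1, 9, 30),
    duration_s=125,
)
```

The tower ID is what makes movement analysis possible: it ties each call to an approximate place without the record ever containing a name.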
By extracting anonymous data points from millions of subscribers in both developing and developed markets, Eagle and other researchers are able to create enormous databases of information. Eagle says his group uses many open-source tools, such as Python, though it also builds its own analytics tools that run on machines with 1 TB of RAM.
With the right algorithms and computing resources, the group can extrapolate movement patterns and behavioral data, which organizations like the World Bank and the United Nations can use to refine and even define their poverty and disease elimination programs. For instance, sudden anomalies in movement patterns from cell phone users in particular villages or regions in Rwanda could be used to detect disease outbreaks. Mobile phone data could be used to quantify the social dynamics of Nairobi’s slums and pick out patterns in the spread of HIV in the Dominican Republic.
In Kenya, Eagle worked with one public health organization with the ambitious aim of eradicating malaria on a particular stretch of the Indian Ocean coast. But after compiling and analyzing data from all of Kenya’s mobile carriers, Eagle determined that malaria simply couldn’t be controlled in that region. The reason: the high level of mobility between the coast and other areas of the country.
“The dirty little secret of malaria is the human vector and the reintroduction of the parasite,” Eagle said. Even if you succeed in eliminating the parasite in the entire region’s population, people come and go, bringing the disease with them from other areas. Eagle’s analysis found that if that mobility is too high, it would counteract any reasonable malaria eradication measure. “If the number of people [coming and going] reaches a certain threshold, then you cannot feasibly eradicate the disease in an area,” he said.
Such data analysis is now helping antimalaria organizations target their resources in the most effective way possible.

People as particles

What does the phone data Eagle collects tell him? “We’re looking at how people move around, thinking about people as particles,” Eagle said.
Like particles, people tend to oscillate within predictable boundaries. Those oscillations can be used to detect the general movement patterns of a group or society. When a particular person’s radius of gyration suddenly changes, it means nothing. But when the oscillations for, say, an entire village or region change, something big has happened. The key is determining what exactly has happened.
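For readers unfamiliar with the term, the radius of gyration is the root-mean-square distance of a set of points from their centroid. The Python sketch below computes it for one person’s recorded locations. It assumes each call has already been mapped to flat (x, y) coordinates in kilometres, which is a simplification for illustration and not a description of Eagle’s own tools.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of visited locations from their centroid.

    `points` is a list of (x, y) positions in kilometres, one per recorded
    call; planar coordinates are an assumption made to keep the sketch short.
    """
    if not points:
        raise ValueError("need at least one location")
    n = len(points)
    cx = sum(x for x, _ in points) / n   # centroid, x coordinate
    cy = sum(y for _, y in points) / n   # centroid, y coordinate
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Made-up example: calls placed from the four corners of a 2 km square.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(radius_of_gyration(square))
```

A person who always calls from the same tower has a radius of zero; the wider the area they roam, the larger the number.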
In Rwanda, Eagle tried to correlate sudden changes in movement with cholera outbreaks. In one village, Eagle and his researchers noticed a sudden shrinking of the movement ranges of its residents. They thought they had predicted a cholera outbreak, but what they had actually detected was a flood caused by a broken dam, which had washed out local roads, greatly constraining the local populace’s movement.
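One simple way to flag that kind of village-wide change is to compare the current week’s average movement range against the village’s own history. The Python sketch below does this with a z-score; the test, the threshold of 3.0 and the sample figures are all illustrative assumptions, not the method Eagle’s group used.

```python
from statistics import mean, stdev


def movement_anomaly(baseline, current, threshold=3.0):
    """Return True when `current` deviates sharply from the baseline.

    `baseline` holds past weekly averages of residents' movement range for
    one village (in km); `current` is this week's average. The z-score test
    and the default threshold of 3.0 are illustrative assumptions.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline values")
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat history: any difference at all counts as a change.
        return current != baseline[0]
    z = (current - mean(baseline)) / spread
    return abs(z) >= threshold


# Made-up weekly averages for one village, followed by a sudden contraction.
history = [5.1, 4.9, 5.0, 5.2, 4.8]
print(movement_anomaly(history, 1.5))   # sharp drop in movement range
print(movement_anomaly(history, 5.05))  # an ordinary week
```

With these sample numbers the first call flags an anomaly and the second does not. As the Rwanda example shows, a flag like this says only that something changed, not whether the cause was cholera, a flood or something else.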

The big picture

Eagle stressed that mobile phone billing data alone can’t predict anything. It’s an enormously useful data set, but to make broader inferences about the meaning of that behavior it must be combined with other data sets, ranging from census to weather and health databases and, most importantly, empirical reports. Nothing beats feet on the ground and eyes observing what they see, Eagle said.
Empirical information can be some of the hardest data to come by, since it involves actual people with clipboards — or their digital equivalents — collecting data. But Eagle is trying to tackle that problem as well. He founded a company called Jana (Jana means “people” in Sanskrit), which aims to use the mobile phone as a means to collect individual survey information as well as mass-crowdsourced data.
Jana uses the access Eagle has to 2.1 million mobile phones in Africa, Asia and Latin America to gather data for global companies and organizations. In exchange for answering survey questions those mobile customers get minutes of airtime, a valuable currency in developing countries. While Jana’s customers use the platform to collect marketing data, that same technology and methodology can be used to ask questions for the benefit of social programs. In many cases the information that marketers and behavioral scientists are seeking is the same.
So the next time you get your phone bill and see row after row of call data, you might think that level of detail is of little use. But know that scientists and entrepreneurs like Nathan Eagle are using that data to better society.