Why data should be our guiding light on public policy

With the advent of open data and new, powerful methods for analyzing it, we’re learning a lot that could challenge longstanding beliefs on public policy. Politicians, social workers and other civil servants have always had data, of course; they just never had as much and could never do with it what they can today. They should listen to what the computers tell them.

What’s possible

Recent HIV research from Brown University is a great example of what’s possible. Researchers formulated a computer model based on numerous factors relating to drug use, sexual activity and the medical aspects of HIV infection. To validate it, they calibrated the model until it could reproduce known HIV infection rates in New York City from 1992 to 2002. They then ran the model thousands of times on a supercomputer.

Credit: Brandon Marshall/Brown University

They found that the rate of HIV infection among New York City injection drug users will be 2.1 per 1,000 by 2040 if current programs are left in place. Expanding needle exchange programs would decrease that rate by 34 percent; expanding HIV testing would result in only a 12 percent reduction. However, a comprehensive approach that includes these two programs as well as two others regarding the administration of medicine and antiretroviral therapy would drop the rate by more than 60 percent, to 0.8 per 1,000.
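For readers who want to check the arithmetic, here is a minimal Python sketch using only the figures quoted above. It assumes each percentage is measured against the 2.1 per 1,000 baseline; the per-program rates it prints are derived from that assumption and are not numbers reported by the researchers. The function and variable names are purely illustrative.

```python
# Figures quoted from the Brown University projection for 2040.
BASELINE_PER_1000 = 2.1        # current programs left in place
COMPREHENSIVE_PER_1000 = 0.8   # all four programs combined


def rate_after_reduction(baseline, percent_reduction):
    """Rate that remains after cutting `baseline` by `percent_reduction` percent."""
    return baseline * (1 - percent_reduction / 100)


def percent_reduction(baseline, new_rate):
    """Percentage drop from `baseline` to `new_rate`."""
    return (baseline - new_rate) / baseline * 100


# Assumption: both percentages are relative to the 2.1 per 1,000 baseline.
needle_exchange = rate_after_reduction(BASELINE_PER_1000, 34)
testing = rate_after_reduction(BASELINE_PER_1000, 12)
combined = percent_reduction(BASELINE_PER_1000, COMPREHENSIVE_PER_1000)

print(f"Needle exchange only: {needle_exchange:.2f} per 1,000")
print(f"HIV testing only:     {testing:.2f} per 1,000")
print(f"Comprehensive plan:   {combined:.1f} percent reduction")
```

Under that assumption, the comprehensive plan's drop from 2.1 to 0.8 works out to roughly 62 percent, consistent with the "more than 60 percent" figure.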
Assuming their model is accurate, that’s a significant reduction — getting HIV rates among drug injectors near zero — and it’s all thanks to access to lots of data and lots of computing power. Recently, another group of researchers in Europe developed a computer model that found a strong correlation between web censorship and high violence rates during times of social unrest — a timely finding given the current state of world affairs.
Last week, I explained how Xerox is working to help Los Angeles and other cities get a better view of their traffic so they can try to make life more efficient and less congested for citizens, while simultaneously reducing pollution and optimizing budgetary resources. To achieve these goals, Xerox and other companies in this space are gathering data from everywhere — cars, mass-transit systems, traffic sensors, cell phones, weather databases — and developing complex machine learning models to determine how everything is connected.
Of course, these are just a handful of examples of what researchers and others are working on with regard to data. Pick an area of public concern (climate change, smart grid, crime rates, genetics, whatever) and you’ll find someone with mountains of data running some seriously complex algorithms to make sense of it.

Anyone can do it

However, as anyone who reads GigaOM regularly probably knows, decision-makers don’t need in-house supercomputers or data scientists on staff to inform their policies with data (although the latter wouldn’t be a bad idea). All they really need is an internet connection. Data sets are available everywhere you look, including at data marketplaces such as Factual and Infochimps, at Data.gov, and even increasingly on news sites such as the Guardian (see disclosure). Thanks to cloud computing, the resources necessary to analyze this data are cheap and plentiful.
And with increasingly prevalent cloud services targeting low- to mid-level users who want to run some relatively simple analyses, there’s no excuse for politicians and others not to inform their decisions with, nay, base them on, data. Last week, with company at my house and two toddlers running around, I was able to sit down with my laptop and generate a predictive model for gun-related homicide rates using a service called BigML and data from the Guardian’s Datablog. It’s nowhere near as sophisticated as Brown’s model, but I was able to do it while sitting on my couch.
Lazy politicians need not even get their hands dirty with raw data because chances are some journalist or bureaucrat has already analyzed it for them. Data on gun ownership in the United States versus the rest of the world is everywhere this week, as is, already, data on the spike in gun sales after last week’s shootings in Colorado.
The Nevada state legislator I recently heard on the radio struggling to defend his proposed tax on junk food would have benefited from reading this study from the USDA. It’s the top result on Google when searching “junk food cheaper than healthy food.” There’s also this interesting study on the effectiveness of Mayor Bloomberg’s giant soda ban in New York.

Why we should listen to the data

Look at the state of the world right now. Droughts, deficits, civil wars, obesity epidemics. A skeptic would argue that the old methods of public policy decision-making, driven largely by political and economic concerns, haven’t worked out too well. Why not give data a chance to take the lead? In the wake of the great recession, smart businesses certainly have.
It’s a simple proposition: Choose an important issue, find relevant data on it, analyze the data (or trust someone else’s analysis), and go from there. It’s an objective starting viewpoint about whether something might actually work, political pressures be damned. Who knows, a brave politician who plants a stake not on the left or the right, but with data analysis, might end up looking like a hero in the end.
Disclosure: Guardian News and Media Ltd., the parent company of the Guardian newspaper, is an investor in the parent company of this blog, Giga Omni Media.
Feature image courtesy of Shutterstock user MikeE.