Why big data matters and data-ism doesn’t

There has been something of a data backlash happening lately, and I think I’ve figured out why: Data for the sake of data has a tendency to sanitize experiences we’d rather leave a little bit dirty. But there’s a big, meaningful difference that’s worth knowing between big data and just plain data.
David Brooks’s recent column in the New York Times is a good example of this. He coined the term “data-ism” (which is quite apt) to describe a newfound penchant for reducing everything in our worlds into a number or statistic. Skeptical of this data worship, he is — rightfully — inclined to rebel.
But everything Brooks mentions in his article is really just statistics, the stuff academicians and businesspeople have been doing for years. It doesn’t take any revolutionary technological advances to measure the effect of political spending on campaign results or the idiosyncrasies in how a president speaks. At best, these types of analyses are enlightening; at worst, they’re overkill.

Data can be an unwelcome disinfectant

Take Brooks’s pointer to a study about whether there’s such a thing as a hot hand in basketball, or a recent debate (that got an incredible amount of undeserved digital ink from Deadspin and Nick Carr) about whether to adjust the points and frequency of Scrabble tiles based on what letters actually appear most in the English language. Right or wrong, who cares?
Unless you’re a professional gambler or in the sports business, sports are supposed to be fun, an escape from reality. Buying into things like hot hands, sweet spots, ancient curses and concern rays (thanks, Dave Barry) is part of the rooting experience. If it weren’t for coaches’ insistence on punting on fourth down, I could watch an entire football game and not think about probabilities once.
As for Scrabble, well, it’s a game and it’s fun. People like it as it is. What’s next, lobbying to change the distribution of resource cards in Settlers of Catan to account for the relative value of each given recent drought conditions?
[youtube http://www.youtube.com/watch?v=h8Kgjid4-u0]
Likewise, while Max Levchin’s vision of the future recently had Nick Carr concerned about big brother (read my colleague Mathew Ingram’s take on it here), my takeaway from Carr’s blog post was more about the threat of a sterilized world. Human beings are not rational actors, and many of us don’t want to be, regardless of what the data says. We buy enormous sodas even though we don’t finish them, we demand all-you-can-eat data plans even though we don’t consume that much data and, directly addressing one of Levchin’s predictions, I bet many of us would willingly pay more for flat-rate auto insurance even if utility-style billing based on our real-time driving behavior would save us money.
Reducing the things we like — watching sports, eating, web surfing, driving — to data points ruins the experience of living carefree and exposes our optimistic anything-can-happen attitudes to a cold, surgical light. If I thought these were the pinnacle of data’s achievements, I’d rebel, too.

Data’s real promise is innovation

Thankfully, however, I’ve been lucky enough to spend my days speaking with some of the smartest data minds around and covering some truly revolutionary technologies. If there’s one thing I’ve learned, it’s that the real value of data isn’t just in uncovering statistical realities, but in finding methods for doing so where it was hitherto impossible and in creating entirely new products that change the way we interact with our world.
Big data is a technological revolution centered around collecting, storing and processing more data of more types than ever before. It’s also about doing all this stuff faster than ever before as data streams in from sensors, servers, Twitter, web surfing and however else we’re generating data. Data scientists are thinking up clever ways to stitch this data together, apply statistical techniques and do all sorts of things. They’re optimizing commerce, clearing traffic, insuring against inclement weather and even detecting genetic markers that might lead to a cure for cancer.

Climate Corporation's policies are based on some incredible data science.

If you want to hear a lot more about what’s possible, come to our Structure: Data conference March 20-21 in New York.
Yes, there’s some value to what David Brooks calls data-ism — there’s a lot to be learned simply from monitoring new data sources, and a renewed focus on visualization means interesting data is now presented in ways that anyone can and might actually want to digest. But the real reason people are, or should be, excited about data is the promise of doing important things faster and better than previously possible (where those things were even possible before).
Talk to me when you’re able to predict a flu outbreak in real time based on automobile traffic patterns, smart grid data on heater usage and an uptick in illness references on Twitter. If you just wanna tell me that, statistically speaking, chicken soup doesn’t actually appear to affect the longevity of the common cold, well, I think I’ll pass. Chicken soup makes me feel better.
Feature image courtesy of Shutterstock user Jirsak.