Why 2011 Will Be a Big Year for Big Data

During the 2011 NFL playoff TV broadcasts, amid commercials featuring Anheuser-Busch Clydesdales and auto-racing driver Danica Patrick, one ad features an IBM researcher talking about data analytics. While NFL TV broadcasts may seem an unusual forum for a discussion of data analytics, information management and analysis play an important role in professional sports. Or to put it plainly, big data has officially gone global.

With data volumes moving past terabytes to tens of petabytes and more, business and IT leaders across the board face significant opportunities and challenges from big data. For a large company, big data may be in the petabytes or more; for a small or mid-size enterprise, data volumes that grow into tens of terabytes become “big data.”

Better service through big data
For healthcare provider Kaiser Permanente and its more than 8 million members, big data is about improving the quality of care and reducing costs. Using electronic healthcare records and decision-support software, Kaiser doctors and nurses can view a patient’s complete history, including lab test results, prescriptions, diagnoses, treatments, demographics, medical plan and payment records. Further, patients can avoid unnecessary trips to the hospital through a personalized online “My Health Manager” that allows them to securely email doctors and request mail-order prescription refills.

In the financial services industry, Fidelity National Information Services (FIS), which sells risk management and fraud detection services to credit card issuers, uses big data analytics to better detect credit card fraud. The work FIS analysts do is highly ad hoc and interactive. They frequently run complex queries correlating multiple activities across different data sets to stay one step ahead of credit card thieves. As they detect new methods of fraud, those methods are encoded into the company’s search algorithms and into the operational systems that accept or decline a credit card transaction in real time.

Visualize big data

Source: LinkedIn

As Stacey discusses, one of the biggest challenges is making data intelligible and accessible. Visualizations help business users identify patterns and act on them. For example, LinkedIn Maps (above) enables users to map professional networks and understand relationships among connections. Your map is color-coded to represent different affiliations or groups from your professional career, such as your previous employer, college classmates or industries you’ve worked in. When you click on a contact within a circle, you’ll see their profile pop up on the right, as well as lines highlighting how they’re connected to your connections. To power large-scale data computations of more than 100 billion relationships a day and low-latency site serving, LinkedIn uses a combination of Hadoop to process massive batch workloads, Project Voldemort as a NoSQL key/value storage engine and the Azkaban open-source workflow system to control ETL jobs.

While it’s exciting to see advanced analytics touted in NFL playoff commercials and by other organizations, the space is not without its limitations. It takes time to adopt best practices for these new technologies, and, more fundamentally, business processes often adapt slowly to them. Data remains siloed, and let’s not forget the numerous privacy issues that continue to crop up. Nonetheless, 2011 is shaping up to be a big year for big data.

To read more about the opportunities, players, business models and challenges in the space, check out our 2011 Big Data Preview at GigaOM Pro (subscription required). For more insights from the big data landscape, come to GigaOM’s Structure: Big Data conference on March 23 in New York City.

Image courtesy of Microsoft.
