Why Big Data Startups Should Take a Narrow View

Looking back on last week’s Structure Big Data conference, one of the statements that struck me most was CA CTO Donald Ferguson’s notion that big data represents a “very promising” opportunity for startups, particularly those targeting specific use cases. I think he’s right, particularly with regard to the latter part: the market for horizontally focused products is filling up fast with both startups and large vendors, so innovative companies might look at how best to tune big data tools for specific industries.

As I explained in detail last week, Hadoop has become popular among companies of all sizes, but most products at this point target broad use cases across industries. Yes, there’s still room for startups to get in here, but the door looks to be closing fast. It’s not just Hadoop, either; other techniques, from traditional data warehouses to, arguably, predictive analytics, are all nearing the saturation point in terms of vendors selling the core technologies. Even a step up the stack from the core Hadoop layer, vendors like Datameer are selling familiar-looking interfaces that abstract away the complexities of processing and analyzing data with Hadoop.

But Ferguson made a particularly pointed, if not novel, observation: analyzing social media data is not the same, either in technique or in purpose, as analyzing user data to feed a recommendation engine for a site like Netflix. And herein lies the opportunity. Organizations keep hearing about big data and about how big an opportunity it is, but even though the technology to capitalize on this opportunity is getting democratized, organizations still face a big challenge in hiring personnel who understand not only the technology, but also how to ask the right questions. Sure, analyzing social media data to find out what consumers like or how they might act sounds great, but actually being able to do it accurately is another issue. It’s a situation just begging for startups to fill the void between big data tools and actually using them for a particular task.

Whether the focus is by industry (e.g., tools for financial services, retail, etc.) or by use case (e.g., sentiment analysis, recommendation engines, etc.), one can easily envision an emerging class of companies tuning technologies like Hadoop or predictive analytics software to directly address these discrete classes of users. Organizations won’t necessarily need data scientists to “turn information into gold” if the data scientists employed by their software vendors have already done most of the work. Think about it like functions within spreadsheet applications tuned to specific industries, or like how PaaS startups took cloud computing a step further by configuring infrastructure with the push of a button. Just feed the application some data, push a button, and get results, no Ph.D. required.

To a degree, this is already starting to happen, but primarily through large vendors adapting their existing software (e.g., SAS for social media) and in the form of fairly limited-scope analytic technologies (e.g., graph databases). I think these are just baby steps toward what could be a huge opportunity. Companies of all types want to be the next Yahoo or Facebook in terms of big data, and there are plenty of companies willing to help them do that in terms of infrastructure. The real opportunity now is in helping companies figure out how to use that infrastructure.

Image courtesy of Pam Brophy.