How Intel is betting on big data to add tens of millions to its bottom line

Big data is nothing new to Intel — the microprocessor giant has been analyzing lots of data about its chip-manufacturing processes for decades — but there’s something about the current generation of “big data” technologies that has the company excited about the future. Tools like Hadoop and techniques such as machine learning at scale are making it easier for Intel to collect and analyze everything from wafer measurements to sales figures, and the result is that Intel is getting much smarter about running its business.

Nothing is more synonymous with Intel than its microprocessor business, and in an age where everyone is talking about the Industrial Internet, manufacturing facilities are increasingly synonymous with big data. And, indeed, Intel’s chip-manufacturing process does generate a lot of data. For years, Aziz Safa, GM of Intel’s internal IT group, told me, the company has been focused on certain types of data — primarily the utilization rates of its machines and information about each chip at each step in the manufacturing process — but the company has really stepped up its machine learning efforts lately.

The tracks on the ceiling transport chips along the production process. Source: Intel

More data means fewer errors

Now, Intel is starting to capture more data from its sensors in order to identify patterns that might help optimize the process and save the company time and money. For example, Safa explained, Intel used to do root-cause analysis whenever something went wrong with a batch of chips, but it always involved a human digging in to find the first step in that chain of events. Machine learning makes this possible without a human, because the algorithms can sift through potentially thousands of data points about each chip to find the common patterns among those that came out defective.
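To make the idea of "finding common patterns among defective chips" concrete, here is a minimal sketch. It is not Intel's method. The record layout (a `route` of step and tool pairs plus a `failed` flag) and the function name `suspect_steps` are assumptions made for illustration. The sketch ranks each step and tool pair by how much more often chips that passed through it failed, relative to the overall failure rate.

```python
from collections import defaultdict

def suspect_steps(chips):
    """Rank (step, tool) pairs by failure-rate lift over the overall rate.

    `chips` is a hypothetical record format: each chip lists the
    (step, tool) pairs it passed through and whether it failed.
    """
    overall = sum(c["failed"] for c in chips) / len(chips)
    if overall == 0:
        return []  # nothing failed, so there is nothing to explain

    seen = defaultdict(int)    # chips that passed through each pair
    failed = defaultdict(int)  # of those, how many failed
    for chip in chips:
        for step_tool in chip["route"]:
            seen[step_tool] += 1
            failed[step_tool] += chip["failed"]

    # Lift > 1 means chips through this pair fail more often than average.
    lift = {k: (failed[k] / seen[k]) / overall for k in seen}
    return sorted(lift.items(), key=lambda kv: kv[1], reverse=True)

chips = [
    {"route": [("etch", "A"), ("polish", "X")], "failed": True},
    {"route": [("etch", "A"), ("polish", "Y")], "failed": True},
    {"route": [("etch", "B"), ("polish", "X")], "failed": False},
    {"route": [("etch", "B"), ("polish", "Y")], "failed": False},
]
ranking = suspect_steps(chips)
print(ranking[0])  # the pair most associated with failures
```

In this toy data every failed chip went through etch tool A, so that pair rises to the top of the ranking. A production system would need far more data and a proper statistical test before blaming a tool.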

What’s more, Safa said, the more data Intel collects about its chips, machinery and processes, the more data it has from which to discover new and meaningful parameters. The company tracks these over time to see if they really do have a predictive effect, rolling the ones that do into the model and ignoring those that don’t. Ultimately, he noted, this helps Intel be smarter about the data it collects even though it can now store a lot more data for a lot less money than previously possible.
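One simple way to check whether a candidate parameter "really does have a predictive effect" is to measure how strongly it correlates with an outcome over time. The sketch below is an illustration under stated assumptions, not a description of Intel's model: the parameter names, the sample values, the use of Pearson correlation and the 0.8 threshold are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def keep_predictive(candidates, outcome, threshold=0.8):
    """Return the names of parameters whose |correlation| meets the threshold."""
    return sorted(
        name for name, values in candidates.items()
        if abs(pearson(values, outcome)) >= threshold
    )

# Hypothetical weekly readings for two candidate parameters.
candidates = {
    "chamber_temp": [1.0, 2.0, 3.0, 4.0],
    "humidity": [2.0, 1.0, 2.0, 1.0],
}
defect_rate = [0.1, 0.2, 0.3, 0.4]  # hypothetical outcome for the same weeks

kept = keep_predictive(candidates, defect_rate)
print(kept)
```

Here `chamber_temp` moves in lockstep with the defect rate and is kept, while `humidity` is only weakly related and is dropped. Correlation alone does not establish cause, which is one reason to keep tracking a parameter over time before trusting it.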

Intel is able to store all this data because of Hadoop, and it has been doing so using its own distribution of Hadoop for about a year. Safa calls Hadoop “a cheap method for storing data that otherwise we would have ignored.” Even if scale wasn’t historically an issue, structure was: Every equipment manufacturer might have its own format for the data streaming off its machines, and some of that data might be images while the rest is logs or more obscure file types. With Hadoop, Safa said, Intel is able to dump everything in one place and then normalize it or analyze it however makes sense.
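The "dump everything in one place, then normalize it" approach can be sketched in a few lines. Everything specific here is assumed for illustration: the two vendor formats (a CSV-style line in Celsius and a JSON line in Fahrenheit), the field names and the common schema are hypothetical, and the sketch uses plain Python rather than Hadoop.

```python
import json

def from_vendor_a(line):
    # Hypothetical format: "tool_id,unix_timestamp,temperature_c"
    tool, ts, temp = line.strip().split(",")
    return {"tool": tool, "ts": int(ts), "temp_c": float(temp)}

def from_vendor_b(line):
    # Hypothetical format: a JSON object reporting temperature in Fahrenheit
    rec = json.loads(line)
    temp_c = round((rec["tempF"] - 32) * 5 / 9, 2)
    return {"tool": rec["machine"], "ts": int(rec["time"]), "temp_c": temp_c}

# One parser per source; the raw lines are stored untouched and only
# converted to the shared schema when they are read for analysis.
PARSERS = {"a": from_vendor_a, "b": from_vendor_b}

def normalize(raw):
    """Convert (vendor, raw_line) pairs into records with a common schema."""
    return [PARSERS[vendor](line) for vendor, line in raw]

raw = [
    ("a", "etch-01,1370000000,21.5"),
    ("b", '{"machine": "litho-07", "time": 1370000060, "tempF": 212}'),
]
records = normalize(raw)
print(records)
```

Keeping the raw lines and applying the schema at read time means a new vendor format only requires a new parser, not a change to what was already stored.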

Optimizing sales with predictive analytics

However, Safa noted, “Advanced analytics is not something new we’re doing for manufacturing, it’s something new we’re doing outside manufacturing.”

He elaborated on how Intel is also using big data to inform better sales and marketing decisions. The idea there is to collect historical data about Intel’s 140,000 customers in order to let sales reps focus on the right ones (kind of like an internal version of Infer). One part of this process is a similarity analysis of sorts to find customers that have the same types of buying patterns or perhaps similar needs, much like how Amazon recommends products that are often purchased together or viewed by the same people.


It appears that recommendation engine is built using a Hadoop-based set of machine learning libraries called Mahout, at least according to a July 2012 Intel whitepaper describing then-current big data proof-of-concept projects.
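For readers curious what a similarity analysis of buying patterns looks like, here is a minimal sketch using cosine similarity over purchase counts. It is written in plain Python and does not use Mahout or reflect Intel's actual implementation. The customer names, product categories and quantities are made up for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse purchase vectors (dicts)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(target, purchases):
    """Return (customer, score) for the customer most similar to `target`."""
    others = [
        (name, cosine(purchases[target], vec))
        for name, vec in purchases.items()
        if name != target
    ]
    return max(others, key=lambda pair: pair[1])

# Hypothetical purchase histories: product category -> units bought.
purchases = {
    "acme": {"server_cpu": 10, "ssd": 2},
    "globex": {"server_cpu": 20, "ssd": 4},
    "initech": {"laptop_cpu": 5},
}
best = most_similar("acme", purchases)
print(best[0])
```

Cosine similarity compares the mix of products rather than the volume, so a customer that buys the same things in double the quantity still counts as a close match.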

Right now, the customer-ranking process runs quarterly just before Intel’s sales team is assigned new accounts, another whitepaper explains, although the plan is to make the process more timely in some respects. By using Hadoop to process more unstructured data about accounts (it’s presently a data warehouse-driven process), Safa said Intel hopes to give salespeople alerts about the best time to engage certain customers and what they might need.

He wouldn’t go into details about just how much money the company’s big data efforts might make or save it, but he did say the company doesn’t devise proof-of-concept projects for problems worth less than $10 million a year and that the work around keeping the manufacturing facilities operating optimally is definitely increasing profits. The whitepaper laying out Intel’s plans for improving sales says the project version of the system resulted in $3 million in estimated incremental revenue and is expected to result in an additional $20 million when rolled out globally.