Facebook on Friday open sourced a handful of software libraries that it claims will help users build bigger, faster deep learning models than existing tools allow.
The libraries, which [company]Facebook[/company] is calling modules, are alternatives for the default ones in a popular machine learning development environment called Torch, and are optimized to run on [company]Nvidia[/company] graphics processing units. Among the modules are those designed to rapidly speed up training for large computer vision systems (nearly 24 times, in some cases), to train systems on potentially millions of different classes (e.g., predicting whether a word will appear across a large number of documents, or whether a picture was taken in any city anywhere), and an optimized method for building language models and word embeddings (e.g., knowing how different words are related to each other).
“‘[T]here is no way you can use anything existing” to achieve some of these results, said Soumith Chintala, an engineer with Facebook Artificial Intelligence Research.
That team was formed in December 2013 when Facebook hired prominent New York University researcher Yann LeCun to run it. Rob Fergus, one of LeCun’s NYU colleagues who also joined Facebook at the same time, will be speaking on March 19 at our Structure Data conference in New York.
Despite the sometimes significant improvements in speed and scale, however, the new Facebook modules probably are “not going to be super impactful in terms of today’s use cases,” Chintala said. While they might produce noticeable improvements within most companies’ or research teams’ deep learning environments, he explained, they’ll really make a difference (and justify making the switch) when more folks are working on stuff at a scale like Facebook is now — “using models that people [previously] thought were not possible.”
Perhaps the bigger and more important picture now, then, is that Friday’s open source releases represent the start of a broader Facebook effort to open up its deep learning research the way it has opened up its work on webscale software and data centers. “We are actually going to start building things in the open,” Chintala said, releasing a steady stream of code instead of just the occasional big breakthrough.
Facebook is also working fairly closely with Nvidia to rework some of its deep learning programming libraries to work at web scale, he added. Although it’s working at a scale beyond many mainstream deep learning efforts and its researchers change directions faster than would be feasible for a commercial vendor, Facebook’s advances could find their way into future releases of Nvidia’s libraries.
Given the excitement around deep learning right now — for everything from photo albums to self-driving cars — it’s a big deal that more and better open source code is becoming available. Facebook joins projects such as Torch (which it uses), Caffe and the Deeplearning4j framework being pushed by startup Skymind. Google has also been active in releasing certain tooling and datasets ideal for training models.
It was open source software that helped make general big data platforms, using software such as Hadoop and Kafka, a reality outside of cutting-edge web companies. Open source might help the same thing happen with deep learning, too — scaling it beyond the advances of leading labs at Facebook, Google, Baidu and Microsoft.