Here’s more evidence that sports is a goldmine for machine learning

If you really like sports and you’re really skilled at data analysis or machine learning, you might want to make that your profession.

On Thursday, private equity firm Vista announced it has acquired a natural-language processing startup called Automated Insights and will make it a subsidiary of STATS, a sports data company that Vista also owns. It’s just the latest example of how much money there is to be made when you combine sports, data and algorithms.

The most-popular story about Automated Insights is that its machine-learning algorithms are behind the Associated Press’s remarkably successful automated corporate-earnings stories, but there’s much more to the business than that. The company claims its algorithms have a place in all sorts of areas where users might want to interact with information in natural language — fitness apps, health care, business intelligence and, of course, sports.

In fact, someone from Automated Insights recently told me that fantasy sports is a potential cash cow for the company. Because its algorithms can analyze data and the outcomes of individual matchups, it can deliver everything from in-game trash-talk to post-game summaries. The better the algorithms are at mimicking natural language (i.e., not just regurgitating stats with some static nouns and verbs around them), the more engaging the user experience — and the more money the fantasy sports platform, and Automated Insights as a partner, make. Automated Insights already provides some of this experience for Yahoo Sports.


So it’s not surprising that STATS would acquire Automated Insights. STATS provides a lot of data products to broadcasters and folks selling mobile and web applications, ranging from analysis to graphics to its SportVU player-tracking system. At our Structure Data conference next month in New York, STATS Executive Vice President of Pro Analytics Bill Squadron will be on stage along with ESPN’s vice president of data platforms, Krish Dasgupta, to discuss how the two companies are working together to sate an ever-growing sports-fan thirst for data. (We’ll also have experts in machine learning and deep learning from places such as Facebook, Yahoo and Spotify discussing the state of the art in building machines that understand language, images and even music.)

And Automated Insights isn’t even STATS’s first acquisition this week. On Tuesday, the company announced it had acquired The Sports Network, a sports news and data provider. In September, STATS acquired Bloomberg Sports.

More broadly, though, the intersection of sports and data is becoming a big space with the potential to be huge. Every year around this time, people in the United States start going crazy over the NCAA collegiate men’s basketball tournament (aka March Madness) and spend billions of dollars betting on it in office pools and at sports books. And every year for the past several, we have been seeing more and more predictive models and other tools for helping people predict who’ll win and lose each game.


Statistician superstar Nate Silver might be best known for his ability to predict elections, but he has been applying his trade to sports including baseball and the NCAA tournament for years, too. It’s no wonder ESPN bought him and his FiveThirtyEight blog and turned it into a full-on news outlet that includes a heavy emphasis on sports data.

The National Football League might present the biggest opportunity to cash in on sports data. Aside from the ability to predict games and player performance (gambling on the NFL — including fantasy football — is a huge business), we now see individuals making their livings with football-analysis blogs that turn into consulting gigs. There’s a growing movement to tackle the challenge of predicting play calling by applying machine learning algorithms to in-game data.

Even media companies are getting into the act. The New York Times dedicates resources to analyzing every fourth down in every NFL game and telling the world whether the coach should have punted, kicked a field goal or gone for it. In 2013, Yahoo bought a startup called SkyPhrase (although it folded the personnel into Yahoo Labs) that developed a way to deliver statistics in response to natural language queries. The NFL was one of its first test cases.

A breakdown of what happens on fourth down.


Injuries are also a big deal, and there is no shortage of thought, or financial investment, going into new ways of analyzing and measuring what’s happening with players’ bodies so teams can better diagnose and prevent injuries. Sensors and cameras located near the field or even on players’ uniforms, combined with new data analysis methods, provide a great opportunity for unlocking some real insights into player safety.

All of this probably only skims the surface of what’s being done with sports data today and what companies, teams and researchers are working toward for tomorrow. So while analyzing sports data might not save the world, it might make you rich. If you’re into that sort of thing.

With $8M and star team, MetaMind does deep learning for enterprise

A Palo Alto startup called MetaMind launched on Friday promising to help enterprises use deep learning to analyze their images, text and other data. The company has raised $8 million from Khosla Ventures and Marc Benioff, and Khosla operating partner and CTO Sven Strohband is its co-founder and CEO. He’s joined by co-founder and CTO Richard Socher, a frequently published researcher, and a small team of other data scientists.

Natural language processing expert Chris Manning of Stanford and Yoshua Bengio of the University of Montreal, considered one of the handful of deep learning masters, are MetaMind’s advisers.

Rather than trying to help companies deploy and train their own deep neural networks and artificial intelligence systems, as some other startups are doing, MetaMind is providing simple interfaces for predetermined tasks. Strohband thinks a lot of users will ultimately care less about the technology underneath and more about what it can do for them.

“I think people, in the end, are trying to solve a problem,” he said.

Sven Strohband (second from left) at Structure Data 2014.


Right now, there are several tools (what the company calls “smart modules”) for computer vision — including image, localization and segmentation — as well as for language. The latter, where much of Socher’s research has focused, includes modules for text classification, sentiment analysis and question-answering, among other things. (MetaMind incorporates a faster, more accurate version of the etcML text-analysis service that Socher helped create while pursuing a Ph.D. at Stanford.)

During a briefing on MetaMind, Socher demonstrated a capability that merges language and vision and that’s similar, inversely, to a spate of recent work from Google, Stanford and elsewhere around automatically generating detailed captions for images. When he typed in phrases such as “birds on water” or “horse with bald man,” the application surfaced pictures fitting those descriptions and even clustered them based on how similar they are.

Testing out MetaMind's sentiment analysis for Twitter.


Socher and Strohband claim MetaMind’s accuracy in language and vision tasks is comparable to, if not better than, previous systems that have won competitions in those fields. Where applicable, the company’s website shows these comparisons.

MetaMind is also working on modules for reasoning over databases, claiming the ability to automatically fill in missing values and predict column headings. Demo versions of several of these features are available on the company’s website, including a couple that let users import their own text or images and train their own classifiers. Socher calls this “drag-and-drop deep learning.”

The bare image-training interface.


On the surface, the MetaMind service seems similar to those of a couple other deep-learning-based startups, including computer-vision specialist Clarifai but especially AlchemyAPI, which is rapidly expanding its collection of services. If there’s a big difference on the product side right now, it’s that AlchemyAPI has been around for years and has a fairly standard API-based cloud service, and a business model that seems to work for it.

After being trained on five pictures of chocolate chip cookies and five pictures of oatmeal raisin cookies, I tested it on this one.


MetaMind is only four months old, but Strohband said the company plans to keep expanding its capabilities and become a general-purpose artificial intelligence platform. It intends to make money by licensing its modules to enterprise users along with commercial support. However, it does offer some free tools and an API in order to get the technology in front of a lot of users to gin up excitement and learn from what they’re doing.

“Making these tools so easy to use will open up a lot of interesting use cases,” Socher said.

Asked about the prospect of acquiring skilled researchers and engineers in a field where hiring is notoriously difficult (and in a geography, Palo Alto, where companies like Google and Facebook are stockpiling AI experts), Socher suggested it’s not quite as hard as it might seem. Companies like MetaMind just need to look a little outside the box.

“If [someone is] incredibly good at applied math programming … I can teach that person a lot about deep learning in a very short amount of time,” he said.

He thinks another important element, if MetaMind is to be successful, will be for him to continue doing his own research so the company can develop its own techniques and remain on the cutting edge. That’s increasingly difficult in the world of deep learning and neural network research, where large companies are spending hundreds of millions of dollars, universities are doubling down and new papers are published seemingly daily.

“If you rest a little on your laurels here,” Strohband said, “this field moves so fast [you’ll get left behind].”

Allen Foundation gives millions to teach machines common sense

The Paul G. Allen Foundation announced on Wednesday that it has awarded $5.7 million in grants to five projects that aim to teach machines to understand what they see and read. That can be anything from a photograph to a chart, a diagram to an entire textbook.

Researchers are cracking text analysis one dataset at a time

A handful of new research projects from Google, IBM and the Allen Institute for AI highlight the ongoing quest to build computer systems capable of analyzing written language based on understanding concepts rather than just keywords.