For most of its time as an IT industry buzzword, big data has been focused on numbers and letters. Sales numbers, medical results, weather, sensor readings, tweets, news articles — all very different, but also all relatively low-hanging fruit. Now, however, it looks like video is emerging as the next great source for companies to learn about consumers, and for everyone to learn about the world around them.
Thanks to surveillance cameras, GoPros, Dropcams, cell phones and even old-fashioned camcorders, we’re able to record video at unprecedented scale. YouTube sees 100 hours of new content added every minute. But all that video has been kind of a wasteland of information. There’s lots of information embedded in all those frames, but without accurate tags or someone willing to watch every minute of footage, it might as well have been uploaded into a black hole.
Who’s in these videos, what’s going on in them and where were they shot? Who knows.
Taking computer vision to the next level
Lately, though, techniques such as deep learning and other varieties of machine learning have led to impressive advances in areas such as computer vision, speech recognition and language understanding. The companies doing this research, largely Google, Microsoft and now Yahoo, are already using the technology in production for things like voice commands on gaming consoles and cell phones, and for recognizing images in online galleries in order to label and categorize them.
It’s not too big a step to turn these techniques toward video. Researchers at the University of Texas are already using object recognition to create short summaries of long videos so people can know what they’re about without having to rely on titles alone. According to AlchemyAPI Founder and CEO Elliot Turner (who’ll be speaking at Structure Data in March about the promise of delivering artificial intelligence capabilities via API), video is in some ways actually easier to work with than images because the temporal nature of the frames adds context that can help self-learning systems understand what’s happening.
But video data has even more utility than just helping web companies like Google or Facebook understand what’s happening in YouTube or Instagram videos. It’s also a window into our physical world like no other type of data before.
Retail is ground zero for video analytics
Retailers and the companies that cater to them have been particularly quick to catch on to this. Already, they’re using video analysis to figure out when stores are the busiest and where people are walking, stopping and looking. Some are even using eye-level cameras to identify which items people are looking at on fully stocked shelves. Facial recognition software is helping stores assess shoppers’ age, sex and race to target ads and provide accurate data about consumer demographics.
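To make the "when is the store busiest, and where do people go" idea concrete, here is a minimal Python sketch. It assumes an upstream person detector has already turned the footage into timestamped sightings tagged with a store zone. The data, the zone names and the functions `busiest_hour` and `zone_counts` are all hypothetical illustrations, not any vendor's actual pipeline.

```python
from collections import Counter
from datetime import datetime

# Hypothetical detections: (ISO timestamp, store zone) pairs, one per person
# sighting, as an upstream person detector might emit them (assumed format).
detections = [
    ("2014-01-18T10:05:00", "entrance"),
    ("2014-01-18T10:40:00", "shoes"),
    ("2014-01-18T14:10:00", "shoes"),
    ("2014-01-18T14:15:00", "checkout"),
    ("2014-01-18T14:50:00", "shoes"),
]

def busiest_hour(rows):
    """Return (hour_of_day, sightings) for the hour with the most sightings."""
    by_hour = Counter(datetime.fromisoformat(ts).hour for ts, _ in rows)
    return by_hour.most_common(1)[0]

def zone_counts(rows):
    """Count sightings per store zone."""
    return Counter(zone for _, zone in rows)

print(busiest_hour(detections))                # (14, 3)
print(zone_counts(detections).most_common(1))  # [('shoes', 3)]
```

The hard part in practice is the detection step that produces those rows. Once it exists, the retail questions reduce to ordinary counting.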
Steve Russell, founder and CEO of a video analytics company called Prism Skylabs (who’ll be discussing the role of video as a new source of business intelligence at Structure Data), said the goal is partially to give brick-and-mortar retailers the type of information that e-retailers already get about what people look at but don’t buy. Having that type of information can help retailers get a better sense of what inventory they should carry and even where they should put it in the store.
“Imagine if all Amazon knew is how much stuff they sold?” he asked during an interview back in November.
It’s not just about tracking customers’ activity, though. Russell explained that Prism Skylabs’ technology actually uses advanced computer vision techniques (he calls them “super-resolution algorithms”) to take people out of the picture and give its users a clear view of an empty store, even if they’re using low-resolution cameras whose footage often looks grainy without processing. This helps with privacy concerns but also gives retailers and their merchandisers real-time views into stores to make sure they’re clean, that shelves are stocked and that other protocols are being followed.
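Russell did not describe how those algorithms work, so the following is not Prism Skylabs' method. It is a minimal Python sketch of a common stand-in technique, the per-pixel temporal median, which shows why moving shoppers can be dropped from a fixed camera's view. A shopper covers any given pixel only briefly, so the median brightness over many frames is the background. The function name `empty_scene` and the toy frames are assumptions for illustration.

```python
from statistics import median

def empty_scene(frames):
    """Per-pixel temporal median over equally sized grayscale frames.

    Each frame is a list of rows, each row a list of brightness values.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]

# Toy 2x3 frames: the background brightness is 10, and a "shopper"
# (value 200) moves left to right across the top row.
frames = [
    [[200, 10, 10], [10, 10, 10]],
    [[10, 200, 10], [10, 10, 10]],
    [[10, 10, 200], [10, 10, 10]],
]

print(empty_scene(frames))  # [[10, 10, 10], [10, 10, 10]]
```

The median only works when each pixel shows the background in more than half the frames, so someone standing still for most of the clip would survive into the result. Averaging many frames this way also tends to smooth out random sensor noise, which is one reason grainy cameras can still yield a clean still image.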
“All of those are questions that a merchandiser will have to travel to a store with a clipboard to answer, and it’s incredibly expensive,” Russell said.
And given the low costs of cameras now, and the fact that services like Prism are delivered via the cloud, companies can get as granular as they want with their video analysis without having to worry about breaking the bank on cameras or servers and software to store and process the data. Video provides what Russell calls a “vast array of useful tidbits,” and it clearly has potential beyond the retail space. But, he said, “We don’t know of all the things we can do potentially with this very interesting type of data.”
One thing he does know, though: “The core problems of computer vision have largely been solved in the past few years.” Today, Russell added, if you have access to a training set of images and an established truth of what they are and mean, “You can train a computer to do amazing things.”