I take a lot of photos on my smartphone. So many, in fact, that my wife calls me Cellphone Ansel Adams. I can’t imagine how many more digital photos we’d have cluttering up our hard drives and cloud drives if I ever learned how to really use the DSLR.
So I get excited when I read and write about all the advances in computer vision, whether they’re the result of deep learning or some other technique, and all the photo-related acquisitions in that space (Google, Yahoo, Pinterest, Dropbox and Twitter have all bought computer vision startups). I’m well aware there are much wider-ranging and important implications, from better image-search online to disease detection — and we’ll discuss them all at our Structure Data conference in March — but I personally love being able to search through my photos by keyword even though I haven’t tagged them (we’ll probably discuss that at Structure Data, too).
I love that Google+ can detect a good photo, or series of photos, and then spice it up with some Auto-Awesome.
Depending on the service you use to manage photos, there has never been a better time to take too many of them.
If there’s one area that has lagged, though, it’s the creation of curated photo albums. Sometimes Google makes them for me and, although I like it in theory (especially for sharing an experience in a neatly packaged way), they’re usually not that good. It will be an album titled “Trip to New York and Jersey City,” for example, and will indeed include a handful of photos I took in New York, just usually not the ones I would have selected.
Although I’m not about to go through my thousands of photos (or even dozens of photos the day after a trip) and create albums, I’ll gladly let a service to do it for me. But it’s only if the albums are good that I’ll do something beyond glance at them. Usually, I love getting the alert that an album is ready, and then get over the excitement really quickly.
So I was interested to read a new study by Disney Research discussing how its researchers have developed an algorithm creates photo albums based on more factors than just time and geography, or even whether photos are “good.” The full paper goes into a lot more detail about how they trained the system (sorry, no deep learning) but this description from a press release about it sums up the results nicely:
[blockquote person=”” attribution=””]To create a computerized system capable of creating a compelling visual story, the researchers built a model that could create albums based on variety of photo features, including the presence or absence of faces and their spatial layout; overall scene textures and colors; and the esthetic quality of each image.
Their model also incorporated learned rules for how albums are assembled, such as preferences for certain types of photos to be placed at the beginning, in the middle and at the end of albums. An album about a Disney World visit, for instance, might begin with a family photo in front of Cinderella’s castle or with Mickey Mouse. Photos in the middle might pair a wide shot with a close-up, or vice versa. Exclusionary rules, such as avoiding the use the same type of photo more than once, were also learned and incorporated.[/blockquote]
It’s just research and surely isn’t perfect, but it feels like a step in the right direction. It could make sharing photos so much easier and more enjoyable for everyone involved. There’s no doubt the folks at Google, Yahoo and elsewhere are already working on similar things so they can roll them out across services such as Flickr and Google+.
Remember physical slide shows with projectors? The same rules still apply: Your aunt and your friends don’t want to skip through 5 pictures of your finger over the lens, marvel at the beauty of the same rock formation shot from 23 slightly different angles, or laugh at that at that sign that you had to be there to get why it’s funny. They want a handful of pictures of you looking nice in front of famous landmarks or pretty sunsets. Probably on their phone while waiting in line at the checkout.
I don’t always have the self-control or editorial sense to deliver that experience. I’ll be happy if an algorithm can do it for me.