DARPA shows off its tech for indexing the deep web

On Sunday night, 60 Minutes aired a segment about the Defense Advanced Research Projects Agency, or DARPA, and its attempts to secure the internet from hackers, human traffickers and other criminals. One of the DARPA efforts the program highlighted, and covered in more depth in an unaired segment for the web, is a project called Memex, which is essentially a search engine for the deep web and the dark web.

The technology looks pretty amazing in a number of ways, including its scale, its speed and its interface. Of course, it’s also tackling a horrible and often under-appreciated problem, which is the illegal trafficking of women and girls as sex objects. Asked why DARPA is concerned with sex trafficking, Memex inventor Chris White explained that people willing to take part in it are often more likely to take part in other activities, including weapons or drug trafficking, that could have national security implications.

A Memex-generated map of sex trafficking.

I wrote briefly about Memex last month, as part of a post about DARPA-funded research into machine learning algorithms — including computer vision and text analysis algorithms — for extracting even more info from deep web content.

The work DARPA is doing is part of a larger effort, which also includes tech companies like Google and Palantir, to identify and map instances of human trafficking around the world. It’s one of many crimes that have existed for a long time but that the internet has made easier to commit. However, these efforts and others also show how the internet is making it easier for law-enforcement agencies to track and prosecute these crimes, provided the right analytical techniques are in place.

The 60 Minutes segment also featured DARPA innovation head Dan Kaufman, who spoke about web security at our Structure conference last June.

http://youtu.be/VXnFNd9WAAk

DARPA-funded research IDs sex traffickers with machine learning

Carnegie Mellon University is touting a new $3.6 million research grant from the Defense Advanced Research Projects Agency, or DARPA, to build machine learning algorithms that can index online sex ads in order to identify sex traffickers. The research is part of a larger DARPA program called Memex that aims to index seedy portions of the public web and deep web in order to identify any type of human trafficking on a larger scale.

One of the driving forces behind this type of effort is the simple fact that computers can analyze ads soliciting sex at a much greater scale than human investigators can. However, the press release announcing the DARPA grant noted, “In addition to analyzing obvious clues, CMU experts in computer vision, language technologies and machine learning will develop new tools for such tasks as analyzing the authors of ads or extracting subtle information from images.”

Even prior to this project, Carnegie Mellon said researchers at the university were working on the issue of sex trafficking and developed programs that law-enforcement agencies have already used to make arrests. That’s a reassuring piece of information considering that much university research, even the stuff involving serious issues, has a hard time making its way into the hands of law enforcement or others who can act on it.

That said, human trafficking, for sex or otherwise, does seem to be an issue that’s bringing together all sorts of organizations with unique abilities to combat it. Aside from the work at Carnegie Mellon, Google is doing a lot of work to identify victims and their traffickers, via targeted search results as well as partnerships with the Polaris Project and Palantir. There’s also Thorn, a non-profit started by Ashton Kutcher and Demi Moore that uses various technologies to identify cases of child exploitation online.