Facebook relies on natural-language processing to power Graph Search

Since Facebook debuted its Graph Search function in January, the social network has given access to only a small percentage of users: millions of people, out of the 1.06 billion monthly active users Facebook had at the end of 2012. The feature aggregates people, places and things based on user input and quickly provides interesting and sometimes surprising content. That only happens thanks to nifty natural-language processing work that goes on behind the scenes. And it only works with English, for now. Engineers are trying to figure out how to make the product available in other languages.

In an article posted to Facebook’s engineering blog on Monday, research scientist Maxime Boucher and Xiao Li, engineering manager on the natural-language team in Graph Search, provide detailed information on the ways in which Graph Search calls on natural-language processing to guess what users want.

  • Graph Search breaks down search strings into multiple components that serve as commands with which the system can query the database. For instance, “my friends who live in San Francisco” would be run like this: pulling up the user, grabbing that person’s list of friends, calling on the filter for people who currently live in a place, and keeping only those friends who have San Francisco in that field. Graph Search represents that search query as “intersect(friends(me), residents(12345)).” And if that’s exactly what the user had in mind, that query gets converted into language for the Unicorn search engine to chew on. Search terms sometimes include words Graph Search has no use for. At other times, words for guiding queries are missing. And users might plug in terms in the wrong order. Say a user types in “friends San Francisco.” Graph Search might offer “my friends who live in San Francisco” as a good option. If it sees “San Francisco friends,” it could respond with the same suggestion, “my friends who live in San Francisco,” which puts the words in the order a query expects.
  • Graph Search analyzes the words users enter to find entities in the database that they might be referring to, across more than 20 entity categories, such as cities, employers and schools. Using statistics for the entity categories, the tool identifies sequences of words that are more likely to name one kind of entity than another. If “san” precedes “francisco,” the user likely is referring to a city, not a person.
  • The system recognizes slang, nicknames for places, misspellings, the many ways of expressing particular types of data and other peculiarities that users type into the search box and swaps out each of those for terms that actually exist in the database. That means, for example, that subject-verb agreement isn’t necessary for the system to serve up query options that might lead to what users want to see. And words such as “besties” get interpreted as “friends.”
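
The three steps above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not Facebook's implementation: the synonym table, the city table, the filler-word list and the function names are all hypothetical, and plain dictionary lookups stand in for the statistical entity models and grammar rules the engineers describe. Only the ID 12345 and the query string format come from the example in the article.

```python
# Hypothetical data: none of these tables or names come from Facebook's system.
SYNONYMS = {"besties": "friends", "buddies": "friends", "sf": "san francisco"}
CITIES = {"san francisco": 12345}  # made-up ID, echoing the article's example
FILLER = {"my", "who", "live", "lives", "living", "in", "the"}


def normalize(text):
    """Lowercase, split into words, and swap known slang for canonical terms."""
    words = []
    for word in text.lower().split():
        # A synonym may expand to several words ("sf" -> "san francisco").
        words.extend(SYNONYMS.get(word, word).split())
    return words


def find_city(words):
    """Return (city_name, remaining_words) for the longest run of words naming a known city."""
    for size in range(len(words), 0, -1):
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size])
            if phrase in CITIES:
                return phrase, words[:start] + words[start + size:]
    return None, words


def parse(text):
    """Turn free text into (structured_query, suggestion), or None if nothing matches."""
    city, rest = find_city(normalize(text))
    # Drop words the query has no use for, regardless of where they appear.
    rest = [word for word in rest if word not in FILLER]
    if city is None or rest != ["friends"]:
        return None
    query = f"intersect(friends(me), residents({CITIES[city]}))"
    suggestion = f"my friends who live in {city.title()}"
    return query, suggestion
```

In this sketch, parse("friends San Francisco"), parse("San Francisco friends") and parse("my besties in SF") all return the same pair: the structured query "intersect(friends(me), residents(12345))" and the suggestion "my friends who live in San Francisco". Word order, missing connecting words and slang are all absorbed before the query is built.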

Graph Search is visible on one of the most popular social networks in the world and therefore needs to be satisfying for its users. As Boucher and Li write, “The challenge for the team was to make sure that any reasonable user input produces plausible suggestions using Graph Search. To achieve that goal, the team leveraged a number of linguistic resources for conducting lexical analysis on an input query before matching it against terminal rules in the grammar.”

Graph Search still has a long to-do list for engineers to address. One of the biggest challenges is to construct and deploy a language-agnostic Graph Search system, so Facebook users all over the world will be able to do what English speakers can do with the tool. It will be difficult to produce a tool that can adjust for unusual spellings, handle incorrect grammar and otherwise optimize search strings entered in any language. “In Russian, there’s so many inflections around words and a lot of language-specific things we haven’t encountered in English,” Li told me in an interview on Friday. Engineers are now looking at different ways to make the tool available for other languages, Li said. One option? A whole lot of drop-down menus.

While there is still work to do in letting more people try Graph Search, it’s clear that the simple interface for navigating hundreds of millions of objects required engineers to produce a bunch of systems and models. It’s no Google, Siri or DataPop, but, because it contains elements tailored to the data set at hand and common use cases, and because it’s getting better over time, Graph Search is worth keeping an eye on.