How Twitter confirmed the explosion in Harlem first

Every day, people break news on Twitter first. This creates a new real-time global sensor network of eyewitnesses at events as they unfold.

However, verifying information during a fast-breaking news event can be a challenge. The sources are unique: they are in the right place at the right time, tweeting what they see. But they are regular people with Twitter accounts, previously unknown to the world at large and now suddenly on the front line of breaking real-world events.

This unique aspect of Twitter has led some to question the reliability of the information produced. However, many overlook the fact that the aggregate data patterns of early eyewitness tweets can often provide one of the most accurate lenses through which information can be confirmed.

The reason is simple: Twitter itself is one of the most effective means for verifying tweets.

This week, a devastating explosion in Harlem claimed several lives and left New Yorkers deeply saddened. Examining Twitter data from the minutes immediately following the accident reveals information that might otherwise not have been discovered. At Dataminr, the company I founded, we have a large team of data scientists who have spent years analyzing the patterns by which events propagate on Twitter. (I’ll be talking more about the work we do at Structure Data March 19-20.) We have learned just how telling the patterns in Twitter data can be. During breaking news events, even a small number of tweets can provide enough data for our algorithms to characterize the event and determine with high confidence the validity, relevance and actionability of rapidly emerging information.

In just the first minutes after the Harlem explosion, aggregate Twitter data revealed a lot about what had happened. People acted collectively as an on-the-ground detection and sensory network, depicting the scene with granularity long before first responders or reporters arrived.

These Twitter eyewitnesses in Harlem provided a mosaic of images and first-hand accounts, all emerging from one location in a short time. Additionally, the geo-proximity of tweets, the shape and rate of tweet propagation, and the linguistic signatures of the messages quickly illustrated the potential magnitude and importance of what had occurred, all before traditional information sources had even arrived on site.

According to the NYC Fire Department, the explosion took place around 9:31 AM EDT. Twelve minutes passed before local news reported that a serious event had occurred. It wasn’t until 20 minutes after the explosion that the first major news coverage of the tragedy appeared. A wealth of detailed information flowed through Twitter immediately following the explosion and continued throughout those first 12 minutes — and well beyond.

The graphic below shows a set of geo-localized tweets that Dataminr algorithms clustered together during these initial 12 minutes. Before any single source could confirm the event, the truth was already in the tweets. The pattern of tweets painted a specific picture, confirming that the event was in fact happening and that the people tweeting thought it was a “big deal.” The descriptions in the tweets provided raw and important insight into how these eyewitnesses were experiencing the event in real time. Dataminr spotted the pattern and generated an alert within 200 seconds.

[Figure: Dataminr Harlem explosion map]

When people say, “Information on Twitter can be unreliable,” they overlook one of Twitter data’s most distinctive attributes: its ability to verify itself. A single tweet from an unfamiliar source can be difficult to trust on its own. But when tweets around a breaking event are viewed in the aggregate and clustered together, the adjacent “dots” can be synthesized into one of the most definitive data descriptions of an unfolding event.
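To make the idea of aggregate corroboration concrete, here is a minimal Python sketch. It is not Dataminr's algorithm, which this article does not describe; it only illustrates the general principle that several independent tweets from nearly the same place at nearly the same time are more credible together than any one alone. The names `Tweet`, `densest_cluster` and `looks_corroborated`, the 0.5 km radius, the 12-minute window and the three-tweet threshold are all assumptions chosen for illustration, and the sample tweets are made up.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    seconds: float  # seconds since an arbitrary reference time
    lat: float
    lon: float
    text: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def densest_cluster(tweets, radius_km=0.5, window_s=720):
    """Return the largest group of tweets close to one seed tweet in space and time."""
    best = []
    for seed in tweets:
        nearby = [
            t for t in tweets
            if abs(t.seconds - seed.seconds) <= window_s
            and haversine_km(seed.lat, seed.lon, t.lat, t.lon) <= radius_km
        ]
        if len(nearby) > len(best):
            best = nearby
    return best


def looks_corroborated(tweets, min_tweets=3, radius_km=0.5, window_s=720):
    """True if enough tweets agree on roughly the same place and time."""
    return len(densest_cluster(tweets, radius_km, window_s)) >= min_tweets


# Made-up sample data: three nearby reports within minutes, one unrelated tweet.
sample = [
    Tweet(0, 40.8000, -73.9440, "loud boom, building shaking"),
    Tweet(90, 40.8005, -73.9445, "huge explosion, smoke everywhere"),
    Tweet(240, 40.7995, -73.9435, "windows blown out on my block"),
    Tweet(100, 40.7000, -74.0000, "great coffee this morning"),
]

if __name__ == "__main__":
    print(looks_corroborated(sample))
```

A real system would also weigh the other signals mentioned above, such as the rate of propagation and the language of the messages; this sketch looks only at proximity in space and time.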

Can Twitter be an independent means for verifying all information? Not in every case — but often traditional systems for confirming events are not perfect either. Bottom line: Twitter needs to be a critical and irreplaceable part of any process that seeks to verify breaking events.

Ted Bailey is the founder and CEO of Dataminr. Follow him on Twitter @TedBailey.