What Wikipedia can tell us about the future of news

It doesn’t get a lot of attention from media analysts during discussions about breaking news events like the recent shootings in Connecticut, but one of the first places I go whenever that kind of incident occurs is Wikipedia, and I’m usually amazed at how quickly and thoroughly a page about the event is created, updated and edited by unseen and often anonymous editors. Now a social-sciences researcher who specializes in studying Wikipedia as an information source has analyzed this phenomenon, with specific attention to mass shootings like the one at Sandy Hook, and given us a fascinating look inside one of the few crowdsourced online news efforts that occur at that kind of scale.

Brian Keegan is a post-doctoral research fellow in computational social science at Northeastern University, and his research examines how social media like Twitter and Wikipedia can be used to improve predictive models relating to things like electoral success. For his PhD in media and technology, he looked at Wikipedia’s coverage of breaking news events like natural disasters, technological catastrophes, and political upheaval — and in a recent blog post, he combined a look at the Wikipedia article about Sandy Hook with some of his previous research into other similar incidents like the shootings in Aurora earlier this year.

In his post, which was also published at the Nieman Journalism Lab, Keegan describes how he analyzed Wikipedia’s coverage of seven events — including the recent shootings, the attack in Colorado, the mass killings in Norway in 2011, and the shootings at Virginia Tech in 2007. In almost every case, pages about these events were created within one or two hours of the incidents, and involved thousands of edits in the first day or two.

Some articles are edited hundreds of times an hour

What I found most interesting about Keegan’s research — apart from the fascinating graphs of Wikipedia activity, one of which is embedded below — is how the number of people who are contributing to or editing such an article evolves over time. In the early hours following the event, Keegan’s analysis shows that there are dozens or even hundreds of people contributing, but after about four hours the numbers begin to dwindle, and a smaller group takes over. As Keegan describes it:

“[T]he edits over the first few hours are evenly distributed: editors make a single contribution and others immediately jump in to also make single contributions as well. However, around hour 3 or 4, one or more dedicated editors show up and begin to take a vested interest in the article, which is manifest in the rapid centralization of the article. This centralization increases slightly over time across all articles suggesting these dedicated editors continue to edit after other editors move on.”


There isn’t a lot of hard data about what happens when large-scale crowdsourced journalism occurs around a breaking news event. Those involved in projects like The Guardian’s massive MPs Expenses campaign from 2009, in which more than 20,000 volunteers combed through more than 200,000 documents related to the expenses of British politicians, have written about the lessons learned, as have those involved in similar projects like ProPublica’s excellent “Free The Files” experiment during the recent federal election. From those kinds of projects, we have learned that making user contributions easy (lowering the barrier to participation) is important, as is adding an element of gamification or incentives.

But hard data about specific contributions and when they occurred isn’t all that common. Keegan describes one incredible statistic from his study:

“The Virginia Tech article was still being edited several times every minute even 36 hours after the event while other articles were seeing updates every five minutes more than a day after the event. This means that even at 3 am, all these articles are still being updated every few minutes by someone somewhere.”

A swarm of contributors, followed by a core of editors

The picture that emerges from Keegan’s research (his PhD dissertation is here) is of a large swarm of volunteer editors and contributors who descend on Wikipedia en masse during the first few hours after a news event, adding tiny details or editing out certain facts, fixing grammar, etc. And many of the contributors to the more recent incidents like Sandy Hook had never contributed to any of the other articles about similar shootings or mass killings. Within a matter of hours, however, a core group of editors shows up who have been involved in some or all of the previous breaking news articles — one in particular has edited every single mass shooting article within 48 hours of it being created.

In at least some cases, these editors appear to be part of Wikipedia’s semi-professional “cabal” of internal editors, but whatever their status, it seems clear that they are individuals who have taken it upon themselves to oversee the “reporting” of that kind of event. In a very real sense, they are the equivalent of the grizzled desk editor or night editor who looks over the copy submitted by the young cub reporter, except of course that they are largely anonymous and unpaid, and so are the cub reporters they are overseeing.

Does the way that Wikipedia operates contain a solution to the chaos that emerges in social media following an event like the Sandy Hook shootings — which I’ve argued is the way that news works now? In a sense, it is a large-scale version of the real-time crowdsourced verification that Andy Carvin of NPR does, which has come under fire recently. But if nothing else, it should give some of those in the traditional media ideas: Could they turn a breaking news page into the equivalent of a Wikipedia article, and if not, why not?

Post and thumbnail images courtesy of Wikimedia Commons and Brian Keegan