Scientists say tweets predict heart disease and community health

University of Pennsylvania researchers have found that the words people use on Twitter can help predict the rate of heart disease deaths in the counties where they live. Places where people tweet happier language about happier topics show lower rates of heart disease death when compared with Centers for Disease Control statistics, while places with angry language about negative topics show higher rates.

The findings of this study, which was published in the journal Psychological Science, cut across fields such as medicine, psychology, public health and possibly even civil planning. It’s yet another affirmation that Twitter, despite any inherent demographic biases, is a good source of relatively unfiltered data about people’s thoughts and feelings, well beyond the scale and depth of traditional polls or surveys. In this case, the researchers used approximately 148 million geo-tagged tweets from 2009 and 2010 from more than 1,300 counties that contain 88 percent of the U.S. population.

(How to take full advantage of this glut of data, especially for business and governments, is something we’ll cover at our Structure Data conference with Twitter’s Seth McGuire and Dataminr’s Ted Bailey.)

What’s more, at the county level, the Penn study’s findings about language sentiment turn out to be more predictive of heart disease than any other individual factor — including income, smoking and hypertension. A predictive model combining language with those other factors was the most accurate of all.

That’s a result similar to recent research comparing Google Flu Trends with CDC data. It’s worth noting, though, that Flu Trends is an ongoing project that has already been collecting data for years, and that the search queries it collects are much more directly related to influenza than the Penn study’s tweets are to heart disease.

That’s likely why the Penn researchers suspect their findings will be more relevant to community-scale policies or interventions than anything at an individual level, despite previous research that shows a link between emotional well-being and heart disease in individuals. Penn professor Lyle Ungar is quoted in a press release announcing the study’s publication:

“We believe that we are picking up more long-term characteristics of communities. The language may represent the ‘drying out of the wood’ rather than the ‘spark’ that immediately leads to mortality. We can’t predict the number of heart attacks a county will have in a given timeframe, but the language may reveal places to intervene.”

The researchers’ work is part of the university’s Well-Being Project, which has also used Facebook users’ language to build personality profiles.

Indonesia is mapping Jakarta floods in real time using Twitter

After initially mapping 8 million flood-related tweets throughout the region over the past couple of years as part of a Twitter Data Grant, the University of Wollongong and a local emergency agency have developed a project called PetaJakarta. It builds a real-time map of flood-affected areas from geo-tagged tweets directed to the project using a specific hashtag. According to a Twitter blog post announcing the project, the goal is to help emergency workers and citizens in one of the world’s most populous areas understand how floods are moving and which areas have been hit the hardest.

Twitter visualizes a World Cup shootout in tweets

Twitter has released an analysis of activity on the social network during the overtime shootout period in last week’s World Cup match between Brazil and Chile. The pattern, which Twitter claims has repeated itself through every overtime shootout, is pretty interesting: people tweet like crazy leading up to the kick, watch intently (and with hands off keyboards) as the player gets ready and finally kicks, and then tweet like crazy again after the kick scores or misses. Seeing this phenomenon visualized is a small window into the relationships between our eyes, fingers, televisions and computer screens during big events.

Repeat after me: ‘Google is not a proxy for big data’

Another study is reporting on the inaccuracy of the Google Flu Trends project, which predicts seasonal flu rates based on search data. However, Google’s algorithms don’t constitute the “big data” approach to this issue; they’re just one piece of a smart big data approach.

Topsy could help fill in Apple’s big hole — big data

The tech world is wondering how Apple plans to utilize the assets it acquired by buying Topsy, which focuses on collecting and analyzing Twitter data. I suspect Apple is trying to fill a big data void in its platform battle against Google.

Gnip and WordPress deepen ties, expand data partnership

Exclusive. Gnip is making it very clear it’s not just about Twitter anymore. The company, which provides aggregated API access to a variety of social media streams, has significantly expanded its partnership with Automattic, the company that runs WordPress.com.