Researchers from the University of Pennsylvania have demonstrated that Twitter can serve as a dashboard indicator of a community’s psychological well-being and can predict county-level rates of heart disease.
The study, published in Psychological Science, showed that expressions of negative emotions such as anger, stress, and fatigue in the tweets from people in a given county were associated with higher heart disease risk in that county. On the other hand, expressions of positive emotions like excitement and optimism were associated with lower risk. Previous studies have identified many factors that contribute to the risk of heart disease, including behavioral factors such as smoking and psychological factors like stress.
The results suggest that using Twitter as a window into a community’s collective mental state may provide a useful tool in epidemiology. “Getting this data through surveys is expensive and time consuming, but, more important, you’re limited by the questions included on the survey,” said psychological scientist Johannes Eichstaedt, who led the study. “You’ll never get the psychological richness that comes with the infinite variables of what language people choose to use.”
Having seen correlations between language and emotional states in previous research, the scientists wanted to see if they could find evidence linking those emotional states to physical outcomes. Because heart disease is a common cause of early mortality, public health officials carefully record when it is identified as the underlying cause of death. They also collect meticulous data about possible risk factors, such as rates of smoking, obesity, hypertension, and lack of exercise. This data is available on a county-by-county level in the U.S., so the research team aimed to match this physical epidemiology with its digital counterpart drawn from Twitter.
Drawing on a set of public tweets made between 2009 and 2010, the researchers used established emotional dictionaries, as well as automatically generated clusters of words reflecting behaviors and attitudes, to analyze a random sample of tweets from individuals who had made their locations available. Enough tweets and health data were available for about 1,300 U.S. counties, which together contain 88 percent of the country’s population.
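To make the dictionary approach concrete, here is a minimal sketch of how tweets might be scored by county. It is an illustration only, not the study's code: the word lists, the function name `county_emotion_rates`, and the simple tokenizer are all assumptions, and the automatically generated word clusters are not modeled.

```python
import re
from collections import defaultdict

# Hypothetical mini-dictionaries for illustration; the study's actual
# emotional dictionaries are not reproduced in this article.
NEGATIVE_WORDS = {"hate", "angry", "tired"}
POSITIVE_WORDS = {"wonderful", "friends", "great"}


def county_emotion_rates(tweets):
    """Return {county: (negative_rate, positive_rate)}.

    `tweets` is an iterable of (county, text) pairs. Each rate is the share
    of all words tweeted from that county that appear in the dictionary.
    """
    total = defaultdict(int)
    negative = defaultdict(int)
    positive = defaultdict(int)
    for county, text in tweets:
        # Crude tokenizer (an assumption): lowercase runs of letters/apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        total[county] += len(words)
        negative[county] += sum(w in NEGATIVE_WORDS for w in words)
        positive[county] += sum(w in POSITIVE_WORDS for w in words)
    return {
        c: (negative[c] / total[c], positive[c] / total[c])
        for c in total
        if total[c] > 0
    }
```

Calling `county_emotion_rates` on a list of `(county, text)` pairs yields one pair of rates per county, which could then be lined up against that county's health statistics.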
The researchers found that negative emotional language and topics, including words like “hate” and expletives, remained strongly correlated with heart disease mortality even after variables like income and education were taken into account. Positive emotional language, including words like “wonderful” and “friends,” showed the opposite correlation, suggesting that optimism and positive experiences may be protective against heart disease.
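One common way to take a variable "into account" is a partial correlation, sketched below for a single control variable using the standard first-order formula. This is an assumption about how such an adjustment could be done, not a description of the study's actual statistical model, which handled several variables at once; the function names `pearson` and `partial_corr` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z.

    First-order partial correlation:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` might hold a county-level language rate, `y` a mortality rate, and `z` a control such as median income. If the partial correlation stays far from zero, the link between `x` and `y` is not explained by `z` alone.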