Sociol Methods Res. 2017 Aug;46(3):390-421.
doi: 10.1177/0049124115605339. Epub 2015 Oct 9.

Using Twitter for Demographic and Social Science Research: Tools for Data Collection and Processing

Tyler H McCormick et al. Sociol Methods Res. 2017 Aug.

Abstract

Despite recent and growing interest in using Twitter to examine human behavior and attitudes, there is still significant room for growth regarding the ability to leverage Twitter data for social science research. In particular, gleaning demographic information about Twitter users, a key component of much social science research, remains a challenge. This article develops an accurate and reliable data processing approach for social science researchers interested in using Twitter data to examine behaviors and attitudes, as well as the demographic characteristics of the populations expressing or engaging in them. Using information gathered from Twitter users who state an intention not to vote in the 2012 presidential election, we describe a method for processing data to retrieve demographic information reported by users that is not encoded as text (e.g., details of images) and evaluate the reliability of these techniques. We end by assessing the challenges of this data collection strategy and discussing how large-scale social media data may benefit demographic researchers.

Keywords: Twitter; attitudes; data collection; demographics; population.

Figures

Figure 1
Wordle Cloud. This figure illustrates the use of Wordle for preliminary text analysis. Larger terms signify that the words occur more frequently within the document. All words are normalized and stop words are removed from the corpus prior to visualization.
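
The counting step behind a word cloud like the one in Figure 1 can be sketched as follows. This is only an illustration, not the authors' pipeline: the caption does not say how words were normalized or which stop-word list was used, so the lowercasing, the punctuation stripping, the tiny `STOP_WORDS` set, and the sample tweets are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list; a real analysis
# would use a fuller list suited to the corpus.
STOP_WORDS = {"a", "an", "and", "the", "i", "in", "is", "to", "not", "of", "for", "this"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count word occurrences across texts after normalization.

    Normalization here (an assumption) means lowercasing and keeping
    only runs of letters and apostrophes, which drops punctuation.
    """
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Skip stop words so that only content-bearing terms are counted.
        counts.update(t for t in tokens if t not in stop_words)
    return counts


# Made-up example tweets, not data from the study.
sample = [
    "I am not voting in this election",
    "Not voting: the election is a joke",
]
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

In a word cloud, each term's display size would then be scaled by its count in `freqs`, so the most frequent terms appear largest.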
