PLoS One. 2015 Aug 19;10(8):e0134270.
doi: 10.1371/journal.pone.0134270. eCollection 2015.

Testing Propositions Derived from Twitter Studies: Generalization and Replication in Computational Social Science

Hai Liang et al. PLoS One.

Abstract

Replication is an essential requirement for scientific discovery. The current study aims to generalize and replicate 10 propositions made in previous Twitter studies using a representative dataset. Our findings suggest that 6 out of 10 propositions could not be replicated because of variations in data collection, differences in the analytic strategies employed, and inconsistent measurements. The study's contributions are twofold. First, it systematically summarizes and assesses some important claims in the field, which can inform future studies. Second, it proposes a feasible approach to generating a random sample of Twitter users and their associated ego networks, which might serve as a solution for answering social-scientific questions at the individual level without accessing the complete data archive.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Unequal content generation.
(A) Log-log plot of the complementary cumulative distribution functions of the number of tweets per user. (B) Cumulative percentage of tweets created by cumulative percentage of users. Each color corresponds to a subset that includes only users who have posted more than N tweets. The choice of N does not change the distribution qualitatively. The number of tweets (including original posts, retweets, and replies) for each ego was obtained from the user profile API and is therefore not subject to the 3,200-tweet limit of the user timeline API.
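The two quantities plotted in Fig 1 can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the convention P(X >= x) for the complementary cumulative distribution, and the choice to rank users from most to least active in panel (B) are assumptions made for illustration, and the sample counts are hypothetical.

```python
from collections import Counter


def ccdf(counts):
    """Return sorted (x, P(X >= x)) pairs for a list of per-user tweet counts."""
    n = len(counts)
    freq = Counter(counts)
    remaining = n  # number of users with a count >= the current x
    points = []
    for x in sorted(freq):
        points.append((x, remaining / n))
        remaining -= freq[x]
    return points


def cumulative_share(counts):
    """Return (cumulative user fraction, cumulative tweet fraction) pairs.

    Users are ranked from most to least active (an assumption; the caption
    does not state the ordering).
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    running = 0
    curve = []
    for rank, c in enumerate(ordered, start=1):
        running += c
        curve.append((rank / len(ordered), running / total))
    return curve


# Hypothetical tweet counts for seven users.
tweets_per_user = [1, 1, 2, 3, 3, 10, 80]
print(ccdf(tweets_per_user))
print(cumulative_share(tweets_per_user))
```

Plotting the `ccdf` points on log-log axes gives a curve of the kind described in panel (A); the `cumulative_share` points correspond to panel (B).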
Fig 2. Daily and weekly rhythms of Twitter activity.
(A) Tweets posted by hour and day of the week. (B) Number of active users by hour and day of the week. We used the UTC-offset information provided by the REST API to normalize time stamps to local time (see S1 File).
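The time normalization mentioned in the Fig 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the procedure documented in S1 File: it assumes each tweet timestamp is a naive UTC `datetime` and that the user's UTC offset is given in seconds; the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def to_local(created_at_utc, utc_offset_seconds):
    """Shift a naive UTC timestamp by the user's UTC offset (in seconds)."""
    return created_at_utc + timedelta(seconds=utc_offset_seconds)


def weekday_hour_bin(created_at_utc, utc_offset_seconds):
    """Return (weekday, hour) in local time; weekday 0 is Monday."""
    local = to_local(created_at_utc, utc_offset_seconds)
    return local.weekday(), local.hour


def activity_histogram(tweets):
    """Count tweets per (weekday, hour) bin.

    `tweets` is an iterable of (created_at_utc, utc_offset_seconds) pairs.
    """
    return Counter(weekday_hour_bin(ts, offset) for ts, offset in tweets)


# Hypothetical example: one tweet from a user five hours behind UTC.
sample = [(datetime(2015, 8, 19, 2, 30), -18000)]
print(activity_histogram(sample))
```

Counting distinct users per bin instead of tweets would give the analogue of panel (B).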
Fig 3. Content productivity and attention received.
The average number of tweets as a function of (A) the number of followers, (B) the number of followees, and (C) the number of friends. Friend here is defined as a user who has been mentioned at least twice in an ego’s timeline.
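The friend definition in the Fig 3 caption (a user mentioned at least twice in an ego's timeline) can be expressed as a short Python sketch. The regular expression, the case-insensitive matching, and the decision to count a user at most once per tweet are assumptions for illustration; the caption does not specify these details.

```python
import re
from collections import Counter

# Simplified pattern for @-mentions; real tweet parsing has more edge cases.
MENTION = re.compile(r"@(\w+)")


def friends(timeline, min_mentions=2):
    """Return the set of users mentioned in at least `min_mentions` tweets.

    `timeline` is a list of tweet texts from a single ego. A user mentioned
    several times in one tweet is counted once for that tweet (an assumption).
    """
    counts = Counter()
    for text in timeline:
        counts.update({name.lower() for name in MENTION.findall(text)})
    return {user for user, c in counts.items() if c >= min_mentions}


# Hypothetical timeline.
timeline = ["@alice hi", "thanks @Alice and @bob", "no mention here"]
print(friends(timeline))
```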
Fig 4. Degree distribution in the follower-followee network.
Log-log plot of the complementary cumulative distribution functions of the number of followers, the number of followees, and the number of reciprocal friends. The numbers of followers and followees for each user were obtained from the user profile API and are therefore not constrained by the privacy settings for obtaining following relationships.
Fig 5. Measuring Dunbar’s number.
(A) The average number of replies made by users with different numbers of friends. (B) The average number of mentioned users as a function of the number of followees.
Fig 6. Exposure hypothesis and its variations.
The probability of retweeting as a function of the number of followees who have tweeted a post, (A) averaged over all users, (B) by breaking down users into classes based on the number of following friends they have, (C) by breaking down users into classes based on the betweenness in their ego networks, and (D) by breaking down users into classes based on the clustering coefficient in their ego networks. Medians of betweenness and clustering coefficient are used as cut-points for grouping users.
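One way to read the quantities in Fig 6 is sketched below in Python. It is a hedged illustration, not the authors' estimator: it assumes the data have already been reduced to (number of followees who tweeted the post, whether the user retweeted it) observations, estimates the probability as a simple proportion per exposure count, and assigns values equal to the median to the lower group. All names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def retweet_probability(observations):
    """Estimate P(retweet | k exposures) as retweets / observations at each k.

    `observations` is an iterable of (k, retweeted) pairs, where k is the
    number of followees who tweeted the post and `retweeted` is a bool.
    """
    seen = defaultdict(int)
    hits = defaultdict(int)
    for k, retweeted in observations:
        seen[k] += 1
        hits[k] += int(retweeted)
    return {k: hits[k] / seen[k] for k in sorted(seen)}


def split_by_median(user_metric):
    """Split users into (low, high) groups using the median as the cut-point.

    `user_metric` maps a user id to a value such as the clustering
    coefficient of the user's ego network. Ties go to the low group
    (an assumption).
    """
    cut = median(user_metric.values())
    low = {u for u, v in user_metric.items() if v <= cut}
    high = set(user_metric) - low
    return low, high


# Hypothetical observations and per-user clustering coefficients.
obs = [(1, False), (1, True), (2, True), (1, False)]
clustering = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.3}
print(retweet_probability(obs))
print(split_by_median(clustering))
```

Running `retweet_probability` separately on the observations of each group returned by `split_by_median` would give per-class curves analogous to panels (C) and (D).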

