2021 Nov 16;118(46):e2100786118.
doi: 10.1073/pnas.2100786118.

Comparing information diffusion mechanisms by matching on cascade size

Jonas L Juul et al. Proc Natl Acad Sci U S A. 2021.

Abstract

Do some types of information spread faster, broader, or further than others? To understand how information diffusions differ, scholars compare structural properties of the paths taken by content as it spreads through a network, studying so-called cascades. Commonly studied cascade properties include the reach, depth, breadth, and speed of propagation. Drawing conclusions from statistical differences in these properties can be challenging, as many properties are dependent. In this work, we demonstrate the essentiality of controlling for cascade sizes when studying structural differences between collections of cascades. We first revisit two datasets from notable recent studies of online diffusion that reported content-specific differences in cascade topology: an exhaustive corpus of Twitter cascades for verified true- or false-news content by Vosoughi et al. [S. Vosoughi, D. Roy, S. Aral. Science 359, 1146-1151 (2018)] and a comparison of Twitter cascades of videos, pictures, news, and petitions by Goel et al. [S. Goel, A. Anderson, J. Hofman, D. J. Watts. Manage. Sci. 62, 180-196 (2016)]. Using methods that control for joint cascade statistics, we find that for false- and true-news cascades, the reported structural differences can almost entirely be explained by false-news cascades being larger. For videos, images, news, and petitions, structural differences persist when controlling for size. Studying classical models of diffusion, we then give conditions under which differences in structural properties under different models do or do not reduce to differences in size. Our findings are consistent with the mechanisms underlying true- and false-news diffusion being quite similar, differing primarily in the basic infectiousness of their spreading process.
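The idea of matching on cascade size can be illustrated with a minimal sketch. The abstract does not spell out the exact matching procedure, so the code below assumes one simple variant: for every cascade size present in both collections, keep an equal number of randomly chosen cascades from each. The function name `match_on_size`, the `size_of` parameter, and the representation of a cascade as a list of node ids are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def match_on_size(cascades_a, cascades_b, size_of=len, seed=0):
    """Subsample two cascade collections so that both end up with
    exactly the same size distribution (illustrative assumption,
    not necessarily the procedure used in the paper)."""
    rng = random.Random(seed)

    # Group the cascades of each collection by their size.
    by_size_a, by_size_b = defaultdict(list), defaultdict(list)
    for cascade in cascades_a:
        by_size_a[size_of(cascade)].append(cascade)
    for cascade in cascades_b:
        by_size_b[size_of(cascade)].append(cascade)

    matched_a, matched_b = [], []
    # Only sizes that occur in both collections can be matched.
    for size in sorted(by_size_a.keys() & by_size_b.keys()):
        k = min(len(by_size_a[size]), len(by_size_b[size]))
        matched_a.extend(rng.sample(by_size_a[size], k))
        matched_b.extend(rng.sample(by_size_b[size], k))
    return matched_a, matched_b


# Toy data: each cascade is a list of (hypothetical) node ids.
group_a = [[1], [1, 2], [1, 2], [1, 2, 3]]
group_b = [[1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3, 4]]
sub_a, sub_b = match_on_size(group_a, group_b)
```

After matching, any remaining difference in depth, breadth, or other structural statistics between `sub_a` and `sub_b` cannot be attributed to size alone, which is the comparison the abstract argues for.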

Keywords: information diffusion; misinformation; network analysis; social media.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
(A–E) Structural and temporal statistics of false-news and true-news cascades diffusing on Twitter, as presented by Vosoughi et al. (11). Cascades in the two datasets have different size distributions (A). (F–J) The same analyses as the plots directly above, carried out for two subsampled datasets with matched size distributions. Controlling for size collapses statistical differences in these properties. Insets depict each statistic on a simple cascade.
Fig. 2.
(A–C) Structural statistics of videos, images, news, and petitions on Twitter, as presented by Goel et al. Note that only cascades containing at least 100 posts are included in the analysis. Cascades in the four categories of content have different size distributions (A). (D–F) The same analyses as the plots directly above, carried out for subsampled datasets with matched size distributions. Controlling for size does not collapse statistical differences in these properties. Insets depict each statistic on a simple cascade.
Fig. 3.
(A–F) Structural statistics of datasets of cascades simulated using the SIR and IC models. (A and B) Size and maximum breadth of SIR cascades with two different values of the infectivity parameter R0. (C and D) Size and maximum breadth of IC cascades with two different values of R0. (E and F) Size and maximum breadth of SIR vs. IC cascades with the same choice of R0 = 0.8. (G–L) The same analyses as the plots directly above, carried out for two subsampled datasets with matched size distributions. Controlling for size collapses statistical differences in structural properties when simulations come from the same underlying model (IC or SIR), even for different choices of infectivity R0. The collapse does not happen if the underlying models are different. Insets again depict each statistic on a simple cascade. Only size and breadth are shown here due to space constraints; the collapses of the remaining statistical quantities are shown in SI Appendix, Figs. S14, S18, and S22.
Fig. 4.
(A) CCDF of out-degree distributions in the Vosoughi et al. dataset (11) of false-news and true-news cascades on Twitter. (B) CCDF of out-degree distributions in SIR cascades with two different values of the infectivity parameter R0. (C) CCDF of out-degree distributions in SIR cascades and IC cascades with the same choice of R0 = 0.8. (D–F) The same analyses as the plots directly above, carried out for two subsampled datasets with matched size distributions. Controlling for size collapses statistical differences in structural properties for the datasets of true and false news and the simulated data created under the same model with different parameter settings. The collapse does not happen if the underlying models differ. We show KS-test statistics for 1,000 instances of size-matched datasets in SI Appendix, section S-I.
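
The two quantities named in the Fig. 4 caption, the CCDF and the KS-test statistic, can be sketched in a few lines. This is an illustrative implementation using only the standard library, not the authors' code. It assumes the CCDF is defined as P(X > x) (some papers plot P(X ≥ x) instead), and it computes only the two-sample KS statistic, not a p-value. The helper names `ecdf_at`, `ccdf`, and `ks_statistic` are hypothetical.

```python
from bisect import bisect_right


def ecdf_at(sorted_values, x):
    """Empirical CDF F(x) = fraction of observations <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ccdf(values):
    """Return (x, P(X > x)) pairs for each distinct observed value x."""
    ordered = sorted(values)
    return [(x, 1.0 - ecdf_at(ordered, x)) for x in sorted(set(ordered))]


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    gap between the two empirical CDFs. Both ECDFs are step functions
    that only change at observed values, so checking those suffices."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(abs(ecdf_at(a, x) - ecdf_at(b, x)) for x in set(a) | set(b))


# Toy out-degree samples (made-up numbers, for illustration only).
out_degrees = [0, 0, 1, 3]
curve = ccdf(out_degrees)
same = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 1, 1], [2, 2, 2])
```

A statistic near 0 indicates that two distributions are nearly indistinguishable, while a value near 1 indicates that they barely overlap; repeating the computation over many size-matched subsamples gives a distribution of such statistics.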

References

    1. Chapin F. S., Cultural Change (The Century Company, New York, 1928).
    2. Ryan B., Gross N. C., The diffusion of hybrid seed corn in two Iowa communities. Rural Sociol. 8, 15 (1943).
    3. Rogers E. M., Diffusion of Innovations (Free Press of Glencoe, New York, 1962).
    4. Gruhl D., Guha R., Liben-Nowell D., Tomkins A., “Information diffusion through blogspace” in Proceedings of the 13th International Conference on World Wide Web (Association for Computing Machinery, New York, 2004), pp. 491–501.
    5. Adar E., Adamic L. A., “Tracking information epidemics in blogspace” in 2005 ACM International Conference on Web Intelligence (WI’05) (IEEE, 2005), pp. 207–214.
