Sci Adv. 2021 Sep 3;7(36):eabf4393.
doi: 10.1126/sciadv.abf4393. Epub 2021 Sep 1.

Scaling up fact-checking using the wisdom of crowds

Jennifer Allen et al.
Abstract

Professional fact-checking, a prominent approach to combating misinformation, does not scale easily. Furthermore, some distrust fact-checkers because of alleged liberal bias. We explore a solution to these problems: using politically balanced groups of laypeople to identify misinformation at scale. Examining 207 news articles flagged for fact-checking by Facebook algorithms, we compare accuracy ratings of three professional fact-checkers who researched each article to those of 1128 Americans from Amazon Mechanical Turk who rated each article’s headline and lede. The average ratings of small, politically balanced crowds of laypeople (i) correlate with the average fact-checker ratings as well as the fact-checkers’ ratings correlate with each other and (ii) predict whether the majority of fact-checkers rated a headline as “true” with high accuracy. Furthermore, cognitive reflection, political knowledge, and Democratic Party preference are positively related to agreement with fact-checkers, and identifying each headline’s publisher leads to a small increase in agreement with fact-checkers.

Figures

Fig. 1. Correlation across articles between (i) politically balanced layperson aggregate accuracy ratings based on reading the headline and lede and (ii) average fact-checker research-based aggregate accuracy ratings, as a function of the number of layperson ratings per article.
Laypeople are grouped by condition (Source versus No Source). For results showing up to 100 laypeople per article, see fig. S7; for results using the average of the correlations with a single fact-checker rather than the correlation with the average of the fact-checker ratings, see fig. S9. Panels show results for (A) all articles, (B) nonpolitical articles, and (C) political articles. The dashed line indicates the average Pearson correlation between fact-checkers (all articles, r = 0.62; nonpolitical articles, r = 0.69; political articles, r = 0.56). Error bars indicate 95% confidence intervals.
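The crowd-size curves in Fig. 1 can be approximated by repeatedly drawing crowds of a given size and correlating their per-article means with the fact-checker means. A minimal sketch in plain Python, assuming per-article lists of individual layperson ratings and per-article fact-checker averages (function and variable names are mine, and uniform resampling is a simplification: the paper draws politically balanced crowds with equal numbers of Democrats and Republicans):

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def crowd_correlation(lay_ratings, fc_means, k, n_draws=500, seed=0):
    """Average correlation between size-k crowd means and fact-checker means.

    lay_ratings: one list of individual layperson ratings per article
    fc_means: the average fact-checker rating for each article
    k: crowd size; each draw samples k raters per article without replacement
    """
    rng = random.Random(seed)
    rs = []
    for _ in range(n_draws):
        crowd_means = [statistics.mean(rng.sample(r, k)) for r in lay_ratings]
        rs.append(pearson(crowd_means, fc_means))
    return statistics.mean(rs)
```

Sweeping `k` over a range and plotting `crowd_correlation` against it reproduces the shape of the curves, which can then be compared with the dashed fact-checker baseline.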
Fig. 2. Classifying articles as true versus non-true based on layperson aggregate Likert ratings.
(A and B) AUC scores as a function of the number of layperson ratings per article and source condition. AUC is calculated using a model in which the average layperson aggregate Likert rating is used to predict the modal fact-checker categorical rating, where the fact-checker rating is coded as “1” if the modal rating is “True” and “0” otherwise. Error bars indicate 95% confidence intervals. For full receiver operating characteristic curves using a politically balanced crowd of size 26, see section S14. (C) Out-of-sample accuracy for ratings from a politically balanced crowd of size 26 given source information, calculated separately for unanimous and non-unanimous headlines, as the proportion of unanimous headlines in the sample increases. (D) Cutoff for classifying an article as “True” as the proportion of unanimous headlines in the sample increases.
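The AUC in Fig. 2 reduces to a standard rank statistic: the probability that a randomly chosen true article receives a higher crowd score than a randomly chosen non-true one. A dependency-free sketch of that computation (the function name and the exact tie-handling convention, half credit for ties, are my assumptions, not taken from the paper):

```python
def auc(scores, labels):
    """AUC of continuous scores predicting binary labels, via the
    Mann-Whitney identity: the fraction of (positive, negative) pairs
    in which the positive outscores the negative; ties count as 1/2.

    scores: average layperson Likert rating per article
    labels: 1 if the modal fact-checker rating is "True", else 0
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every true article outscores every non-true one; 0.5 is chance-level discrimination.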
Fig. 3. Comparing crowds with different layperson compositions to a baseline, politically balanced crowd.
(A) Pearson correlations between the average aggregate accuracy rating of a crowd of size 26 and the average aggregate accuracy rating of the fact-checkers. (B) AUC for the average aggregate accuracy rating of a crowd of size 26 predicting whether the modal fact-checker categorical rating is true. For both (A) and (B), we compare the baseline to a crowd of only Democrats versus only Republicans, a politically balanced crowd of participants who scored above the median on the Cognitive Reflection Test (CRT) versus at or below the median, and a politically balanced crowd of participants who scored above the median on political knowledge versus at or below the median. Means and CIs are generated using bootstraps with 1000 iterations (see section S2 for details). For analysis comparing political to nonpolitical headlines, see section S13.
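The bootstrapped means and CIs described for Fig. 3 follow the usual percentile-bootstrap recipe: resample the data with replacement, recompute the statistic each time, and take percentiles of the resulting distribution. A generic sketch under that assumption (the paper's section S2 gives the authoritative procedure; this helper and its defaults are illustrative):

```python
import random
import statistics

def bootstrap_ci(values, stat=statistics.mean, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap CI for a statistic.

    Resamples `values` with replacement `n_boot` times (1000, matching the
    iteration count reported in the caption), recomputes `stat` on each
    resample, and returns (estimate, (lower, upper)) at level 1 - alpha.
    """
    rng = random.Random(seed)
    boots = sorted(stat(rng.choices(values, k=len(values)))
                   for _ in range(n_boot))
    lo = boots[int(n_boot * alpha / 2)]
    hi = boots[int(n_boot * (1 - alpha / 2)) - 1]
    return stat(values), (lo, hi)
```

Applied to, say, the per-bootstrap correlations of a Democrat-only crowd versus the balanced baseline, overlapping CIs indicate no reliable difference between compositions.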
