R Soc Open Sci. 2016 Sep 21;3(9):160384.
doi: 10.1098/rsos.160384. eCollection 2016 Sep.

The natural selection of bad science

Paul E Smaldino et al. R Soc Open Sci. 2016.

Abstract

Poor research design and data analysis encourage false-positive findings. Such poor methods persist despite perennial calls for improvement, suggesting that they result from something more than just misunderstanding. The persistence of poor methods results partly from incentives that favour them, leading to the natural selection of bad science. This dynamic requires no conscious strategizing by scientists, no deliberate cheating or loafing; it requires only that publication is a principal factor for career advancement. Some normative methods of analysis have almost certainly been selected to further publication instead of discovery. To improve the culture of science, a shift must be made away from correcting misunderstandings and towards rewarding understanding. We support this argument with empirical evidence and computational modelling. We first present a 60-year meta-analysis of statistical power in the behavioural sciences and show that power has not improved despite repeated demonstrations of the necessity of increasing power. To demonstrate the logical consequences of structural incentives, we then present a dynamic model of scientific communities in which competing laboratories investigate novel or previously published hypotheses using culturally transmitted research methods. As in the real world, successful labs produce more 'progeny', such that their methods are more often copied and their students are more likely to start labs of their own. Selection for high output leads to poorer methods and increasingly high false discovery rates. We additionally show that replication slows but does not stop the process of methodological deterioration. Improving the quality of research requires change at the institutional level.
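The false discovery rate mentioned in the abstract follows from power, the false-positive rate, and the base rate of true hypotheses via a standard identity. A minimal sketch, assuming a base rate of 10% true hypotheses (an illustrative value, not a figure from the paper; the function name is likewise illustrative):

```python
# False discovery rate from power W, false-positive rate alpha, and base
# rate b (the fraction of tested hypotheses that are true). The base rate
# used below is an illustrative assumption: it is not given in the abstract.

def false_discovery_rate(alpha, power, b):
    """Share of positive results that are false positives."""
    false_pos = alpha * (1 - b)  # false positives among untrue hypotheses
    true_pos = power * b         # true positives among true hypotheses
    return false_pos / (false_pos + true_pos)

# With the conventional alpha = 0.05, the meta-analytic mean power of 0.24,
# and an assumed 10% base rate, most positive results are false:
print(round(false_discovery_rate(0.05, 0.24, 0.10), 3))  # -> 0.652
```

Raising power (with α fixed) lowers the FDR, which is why stagnant power matters for the abstract's argument.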

Keywords: Campbell’s Law; cultural evolution; incentives; metascience; replication; statistical power.


Figures

Figure 1.
Average statistical power from 44 reviews of papers published in journals in the social and behavioural sciences between 1960 and 2011. Data are power to detect small effect sizes (d = 0.2), assuming a false-positive rate of α = 0.05, and indicate both very low power (mean = 0.24) and no increase over time (R² = 0.00097).
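The caption's parameters (d = 0.2, α = 0.05) can be plugged into the usual normal approximation for two-sample power to see why a mean power of 0.24 implies small samples. A stdlib-only sketch, not the authors' calculation; the function names are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_sample_power(d, n_per_group, z_crit=1.959964):
    """Approximate power of a two-sided two-sample test at alpha = 0.05.

    Normal approximation: power ~ Phi(d * sqrt(n/2) - z_crit),
    ignoring the negligible rejection mass in the opposite tail.
    """
    return norm_cdf(d * math.sqrt(n_per_group / 2.0) - z_crit)

# A small effect (d = 0.2) needs very large groups to reach 80% power:
print(round(two_sample_power(0.2, 50), 2))   # n = 50 per group
print(round(two_sample_power(0.2, 400), 2))  # n = 400 per group
```

With 50 subjects per group, power for d = 0.2 is under 0.2; roughly 400 per group are needed to approach the conventional 0.8.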
Figure 2.
The relationship between power and false-positive rate, modified by effort, e. Runs analysed in this paper were initialized with e₀ = 75 (shown in orange), such that α = 0.05 when power is 0.8.
Figure 3.
Power evolves. The evolution of mean power (W), false-positive rate (α) and false discovery rate (FDR).
Figure 4.
Effort evolves. The evolution of low mean effort corresponds to evolution of high false-positive and false discovery rates.
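The selection dynamic behind this figure (labs whose methods yield more publications are copied more often) can be caricatured as a toy Wright–Fisher simulation. The payoff function and all parameters below are illustrative assumptions, not the authors' model:

```python
import random

def evolve_effort(n_labs=100, generations=200, seed=1):
    """Toy Wright-Fisher selection on 'effort'.

    Assumption (illustrative): labs with lower effort produce more
    publications per unit time (payoff = 1/effort), so they are copied
    more often and mean effort drifts downward.
    """
    rng = random.Random(seed)
    efforts = [rng.uniform(0.5, 1.0) for _ in range(n_labs)]
    initial_mean = sum(efforts) / n_labs
    for _ in range(generations):
        payoffs = [1.0 / e for e in efforts]
        # each new lab copies a parent chosen in proportion to payoff
        parents = rng.choices(efforts, weights=payoffs, k=n_labs)
        # offspring inherit the parent's methods with small mutation
        efforts = [min(1.0, max(0.5, e + rng.gauss(0.0, 0.01)))
                   for e in parents]
    final_mean = sum(efforts) / n_labs
    return initial_mean, final_mean

start, end = evolve_effort()
print(round(start, 2), round(end, 2))  # mean effort falls under selection
```

No lab here "cheats"; declining effort is purely a consequence of differential copying, which is the abstract's central point.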
Figure 5.
The coevolution of effort and replication.
Figure 6.
The evolution of effort when 0%, 25% or 50% of all studies performed are replications.
Figure 7.
Lab pay-offs from the non-evolutionary model. Each graph shows count distributions for high- and low-effort labs' total pay-offs after 110 time steps, 100 of which included replication. (a–c) The total count for each pay-off is totalled from 50 runs per condition. Panel (c) includes an inset that displays the same data as the larger graph, but for a narrower range of pay-offs. The punishment for having one's novel result fail to replicate is orders of magnitude greater than the benefit of publishing, reflected in the discrete peaks in (b) and (c).
