Am Stat. 2019;73(Suppl 1):271-280.
doi: 10.1080/00031305.2018.1518266. Epub 2019 Mar 20.

The New Statistics for Better Science: Ask How Much, How Uncertain, and What Else is Known

Robert J Calin-Jageman et al. Am Stat. 2019.

Abstract

The "New Statistics" emphasizes effect sizes, confidence intervals, meta-analysis, and the use of Open Science practices. We present 3 specific ways in which a New Statistics approach can help improve scientific practice: by reducing over-confidence in small samples, by reducing confirmation bias, and by fostering more cautious judgments of consistency. We illustrate these points through consideration of the literature on oxytocin and human trust, a research area that typifies some of the endemic problems that arise with poor statistical practice.

Keywords: Confidence Intervals; Estimation; Meta-Analysis; Open Science; the New Statistics.

Figures

Figure 1: Two ways of looking at the same data.
This figure compares the NHST approach (A) and the New Statistics approach to visualizing the same data (B). The data is from Kosfeld et al. (2005) on the effect of intranasal oxytocin on dollars invested in a trust game. In A, a bar graph is used to show median trust and standard error for each group. The * indicates a statistically significant difference in a one-tailed test (p = .029). In B, all the individual data is shown (circles). Each circle with an error bar represents the group median along with the 95% CI for the median. The plot emphasizes the effect size, which is the difference between the two groups (marked by a triangle, which is an increase of $1 in median trust). The error bar represents the uncertainty about that estimate; it is the 90% CI of the difference, which is [0.00001, 2.99]. The confidence interval is not symmetrical around the point estimate. See the last section of the paper for technical details on how this data was summarized.
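The caption pairs a one-tailed test at the .05 level with a 90% CI of the difference. The sketch below illustrates why those two go together: under a normal approximation, the lower bound of a 90% CI sits above zero exactly when the one-tailed p-value is below .05. This is only an illustration. It compares means with a z-approximation rather than the medians and non-parametric methods used for the figure, the function name `diff_ci_and_p` is made up for this example, and the data are hypothetical, not the Kosfeld et al. (2005) data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_ci_and_p(treatment, control, level=0.90):
    """Difference in means, its normal-approximation CI, and a one-tailed p.

    Assumes two independent groups; uses a z-approximation, not a t-test.
    """
    diff = mean(treatment) - mean(control)
    # Standard error of the difference between two independent means.
    se = sqrt(variance(treatment) / len(treatment)
              + variance(control) / len(control))
    # Two-sided critical value for the requested confidence level.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # One-tailed p for the directional hypothesis "treatment > control".
    p_one_tailed = 1 - NormalDist().cdf(diff / se)
    return diff, ci, p_one_tailed


# Hypothetical investment amounts (NOT data from the paper).
oxytocin = [8, 10, 12, 9, 11, 12, 7, 10, 12, 9]
placebo = [7, 9, 8, 10, 6, 11, 8, 9, 7, 10]

diff, (lower, upper), p = diff_ci_and_p(oxytocin, placebo)
print(f"difference = {diff:.2f}, 90% CI [{lower:.2f}, {upper:.2f}], "
      f"one-tailed p = {p:.3f}")
```

Reporting the interval rather than only the p-value keeps the size of the effect and its uncertainty in view, which is the point the figure makes.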
Figure 2: Two heuristic approaches to judging consistency.
This figure compares the NHST approach (A) and the New Statistics approach (B) to making heuristic judgments of consistency. The data is from Kosfeld et al. (2005) on the effect of intranasal oxytocin on dollars invested in a trust game and in a non-trust game that involved only risk. In A, a bar graph is used to show average trust with standard error. The * indicates a statistically significant difference in a one-tailed test (p = .04) for the trust game, whereas the n.s. indicates a non-significant test in the non-trust game. On this basis, many researchers erroneously judge the results to be inconsistent. However, a direct test for the interaction of oxytocin and game type is not statistically significant (p = .20 for a standard ANOVA test for interaction). In B we focus on the effect of oxytocin in each context, plotting the difference in mean trust between the oxytocin and the placebo groups, along with the 90% CIs. The strong overlap between the CIs suggests, correctly, that this is not enough data to judge the results inconsistent. In this figure means are compared for ease of analysis. Kosfeld et al. (2005) actually used non-parametric tests and focused on comparing medians, but that analysis also indicates a non-significant interaction between drug and task (p = .23). The last section of this paper has technical details on how we re-analyzed the data from Kosfeld et al. (2005).
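The reasoning in this caption can be sketched numerically: one effect may be "significant" and another not, yet the difference between the two effects can still have a CI that comfortably includes zero. The sketch below assumes four independent groups and a normal approximation. The helper names `mean_diff` and `ci` are made up for this example, and the data are hypothetical, not the Kosfeld et al. (2005) data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_diff(a, b):
    """Return (difference in means, standard error) for independent groups."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, se


def ci(estimate, se, level=0.90):
    """Normal-approximation confidence interval around an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


# Hypothetical data for the four groups (NOT data from the paper).
trust_oxy = [8, 10, 12, 9, 11, 12, 7, 10, 12, 9]
trust_pla = [7, 9, 8, 10, 6, 11, 8, 9, 7, 10]
risk_oxy = [8, 7, 9, 10, 6, 9, 8, 11, 10, 9]
risk_pla = [7, 9, 8, 10, 6, 11, 8, 9, 7, 9]

# Effect of the drug within each game.
trust_effect, trust_se = mean_diff(trust_oxy, trust_pla)
risk_effect, risk_se = mean_diff(risk_oxy, risk_pla)
trust_ci = ci(trust_effect, trust_se)
risk_ci = ci(risk_effect, risk_se)

# Interaction: the difference between the two effects. Because the
# groups are independent, the squared standard errors add.
interaction = trust_effect - risk_effect
interaction_se = sqrt(trust_se ** 2 + risk_se ** 2)
interaction_ci = ci(interaction, interaction_se)

print("trust effect CI:", trust_ci)
print("risk effect CI:", risk_ci)
print("interaction CI:", interaction_ci)
```

With these made-up numbers the trust-game interval excludes zero and the risk-game interval does not, but the interval for the interaction still spans zero, so the data do not support calling the two effects different.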

