Front Physiol. 2022 Sep 5;13:962132.
doi: 10.3389/fphys.2022.962132. eCollection 2022.

Replacing statistical significance and non-significance with better approaches to sampling uncertainty

Will G Hopkins. Front Physiol. 2022.

Abstract

A sample provides only an approximate estimate of the magnitude of an effect, owing to sampling uncertainty. The following methods address the issue of sampling uncertainty when researchers make a claim about effect magnitude: informal assessment of the range of magnitudes represented by the confidence interval; testing of hypotheses of substantial (meaningful) and non-substantial magnitudes; assessment of the probabilities of substantial and trivial (inconsequential) magnitudes with Bayesian methods based on non-informative or informative priors; and testing of the nil or zero hypothesis. Assessment of the confidence interval, testing of substantial and non-substantial hypotheses, and assessment of Bayesian probabilities with a non-informative prior are subject to differing interpretations but are all effectively equivalent and can reasonably define and provide necessary and sufficient evidence for substantial and trivial effects. Informative priors in Bayesian assessments are problematic, because they are hard to quantify and can bias the outcome. Rejection of the nil hypothesis (presented as statistical significance), and failure to reject the nil hypothesis (presented as statistical non-significance), provide neither necessary nor sufficient evidence for substantial and trivial effects. To properly account for sampling uncertainty in effect magnitudes, researchers should therefore replace rather than supplement the nil-hypothesis test with one or more of the other three equivalent methods. Surprisal values, second-generation p values, and the hypothesis comparisons of evidential statistics are three other recent approaches to sampling uncertainty that are not recommended. Important issues beyond sampling uncertainty include representativeness of sampling, accuracy of the statistical model, individual differences, individual responses, and rewards of benefit and costs of harm of clinically or practically important interventions and side effects.
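The equivalence claimed above between confidence-interval coverage, tests of substantial hypotheses, and Bayesian probabilities with a non-informative prior can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a normally distributed estimate with a known standard error, a symmetric smallest important magnitude (`threshold`), and a flat prior; the function name `assess_effect` and the conclusion labels are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def assess_effect(estimate, se, threshold, level=0.90):
    """Assess an effect against a smallest important magnitude.

    Assumptions (not from the source): normal sampling distribution,
    known standard error, symmetric thresholds at +/- threshold,
    and a flat (non-informative) prior for the Bayesian reading.
    """
    z = NormalDist()  # standard normal
    zcrit = z.inv_cdf(1 - (1 - level) / 2)
    lower = estimate - zcrit * se
    upper = estimate + zcrit * se

    # One-sided p value for the hypothesis "effect >= +threshold".
    # Under a flat prior the same number is the posterior probability
    # that the true effect is substantially positive.
    p_positive = z.cdf((estimate - threshold) / se)
    # Likewise for the hypothesis "effect <= -threshold".
    p_negative = z.cdf((-threshold - estimate) / se)
    p_trivial = 1 - p_positive - p_negative

    # Conclusion from coverage of the interval (labels are illustrative).
    if lower >= threshold:
        conclusion = "substantial positive"
    elif upper <= -threshold:
        conclusion = "substantial negative"
    elif lower >= -threshold and upper <= threshold:
        conclusion = "trivial"
    elif lower < -threshold and upper > threshold:
        conclusion = "unclear"
    elif lower >= -threshold:
        conclusion = "not substantially negative"
    else:
        conclusion = "not substantially positive"

    return {
        "interval": (lower, upper),
        "p_positive": p_positive,
        "p_negative": p_negative,
        "p_trivial": p_trivial,
        "conclusion": conclusion,
    }


if __name__ == "__main__":
    # Made-up numbers purely for demonstration.
    print(assess_effect(estimate=2.0, se=0.5, threshold=1.0))
```

With a 90% interval, the upper limit falls below `threshold` exactly when `p_positive` is below 0.05, which is why the three readings lead to the same decision in this simplified setting.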

Keywords: Bayesian inference; confidence interval; effect magnitude; magnitude-based inference; sampling uncertainty; significance test.

Conflict of interest statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Conclusions about effects determined by coverage of the confidence or compatibility interval (CI), by tests based on rejection of one-sided interval hypotheses, and by Bayesian probabilities, for six qualitatively different dispositions of 90% CI (bars) relative to substantial and trivial magnitudes. +ive, substantial positive; −ive, substantial negative.
FIGURE 2
Compatibility intervals (thin bars, 95%; thick bars, 90%) illustrating significant effects where it would be appropriate (=) and inappropriate (≠) to conclude the effect is substantial, and non-significant effects where it would be appropriate and inappropriate to conclude the effect is trivial. The arrowhead indicates both compatibility intervals extending much further to the left.
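The mismatch described in the Figure 2 caption can be reproduced numerically. The sketch below assumes a normal sampling distribution and a symmetric smallest important magnitude; the function names and the example numbers are hypothetical and are not taken from the article. Significance is judged by whether the 95% interval excludes zero, and magnitude by where the 90% interval sits relative to the thresholds.

```python
from statistics import NormalDist


def interval(estimate, se, level):
    """Two-sided compatibility interval under a normal approximation."""
    zcrit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - zcrit * se, estimate + zcrit * se


def significance_vs_magnitude(estimate, se, threshold):
    """Compare the nil-hypothesis verdict with a magnitude-based verdict."""
    lo95, hi95 = interval(estimate, se, 0.95)
    lo90, hi90 = interval(estimate, se, 0.90)
    return {
        # Significant: the 95% interval excludes zero.
        "significant": lo95 > 0 or hi95 < 0,
        # Substantial: the 90% interval lies entirely beyond a threshold.
        "substantial": lo90 > threshold or hi90 < -threshold,
        # Trivial: the 90% interval lies entirely between the thresholds.
        "trivial": lo90 > -threshold and hi90 < threshold,
    }


if __name__ == "__main__":
    # Significant, yet the whole 90% interval is trivial.
    print(significance_vs_magnitude(estimate=0.3, se=0.1, threshold=1.0))
    # Non-significant, yet the 90% interval reaches substantial values.
    print(significance_vs_magnitude(estimate=1.5, se=1.0, threshold=1.0))
```

In the first call, concluding that the effect is substantial would be inappropriate despite significance; in the second, concluding that it is trivial would be inappropriate despite non-significance.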
