Replacing statistical significance and non-significance with better approaches to sampling uncertainty

Will G Hopkins¹

PMID: 36267575
PMCID: PMC9578285
DOI: 10.3389/fphys.2022.962132

Will G Hopkins. Front Physiol. 2022 Sep 5:13:962132. doi: 10.3389/fphys.2022.962132. eCollection 2022.

Affiliation

¹ Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia.

Abstract

A sample provides only an approximate estimate of the magnitude of an effect, owing to sampling uncertainty. The following methods address the issue of sampling uncertainty when researchers make a claim about effect magnitude: informal assessment of the range of magnitudes represented by the confidence interval; testing of hypotheses of substantial (meaningful) and non-substantial magnitudes; assessment of the probabilities of substantial and trivial (inconsequential) magnitudes with Bayesian methods based on non-informative or informative priors; and testing of the nil or zero hypothesis. Assessment of the confidence interval, testing of substantial and non-substantial hypotheses, and assessment of Bayesian probabilities with a non-informative prior are subject to differing interpretations but are all effectively equivalent and can reasonably define and provide necessary and sufficient evidence for substantial and trivial effects. Informative priors in Bayesian assessments are problematic, because they are hard to quantify and can bias the outcome. Rejection of the nil hypothesis (presented as statistical significance), and failure to reject the nil hypothesis (presented as statistical non-significance), provide neither necessary nor sufficient evidence for substantial and trivial effects. To properly account for sampling uncertainty in effect magnitudes, researchers should therefore replace rather than supplement the nil-hypothesis test with one or more of the other three equivalent methods. Surprisal values, second-generation p values, and the hypothesis comparisons of evidential statistics are three other recent approaches to sampling uncertainty that are not recommended. Important issues beyond sampling uncertainty include representativeness of sampling, accuracy of the statistical model, individual differences, individual responses, and rewards of benefit and costs of harm of clinically or practically important interventions and side effects.

Keywords: Bayesian inference; confidence interval; effect magnitude; magnitude-based inference; sampling uncertainty; significance test.
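The equivalence described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code. It assumes a normally distributed effect estimate with a known standard error and symmetric smallest important thresholds of ±`smallest_important`; the function name, argument names, and example numbers are hypothetical. Under those assumptions, the p value for the test of a substantial hypothesis equals the flat-prior Bayesian probability of that magnitude, and rejecting the hypothesis at the 5% level corresponds to the 90% interval excluding that magnitude.

```python
from statistics import NormalDist


def assess_effect(estimate, se, smallest_important, level=0.90):
    """Interval, one-sided tests, and flat-prior probabilities for one effect.

    Assumes a normal sampling distribution and symmetric thresholds
    at +/- smallest_important (illustrative assumptions only).
    """
    d = smallest_important
    alpha = (1 - level) / 2  # one-sided alpha, e.g. 0.05 for a 90% interval
    z = NormalDist().inv_cdf(1 - alpha)
    lower, upper = estimate - z * se, estimate + z * se

    # One-sided tests of the substantial hypotheses, evaluated at the thresholds.
    # H: true effect >= +d; p = chance of an estimate this low or lower.
    p_substantial_positive = NormalDist(d, se).cdf(estimate)
    # H: true effect <= -d; p = chance of an estimate this high or higher.
    p_substantial_negative = 1 - NormalDist(-d, se).cdf(estimate)

    # With a flat (non-informative) prior the posterior is Normal(estimate, se).
    posterior = NormalDist(estimate, se)
    prob_negative = posterior.cdf(-d)
    prob_positive = 1 - posterior.cdf(d)
    prob_trivial = 1 - prob_negative - prob_positive

    return {
        "ci": (lower, upper),
        "p_substantial_positive": p_substantial_positive,
        "p_substantial_negative": p_substantial_negative,
        "prob_negative": prob_negative,
        "prob_trivial": prob_trivial,
        "prob_positive": prob_positive,
        # Rejection at alpha is the same event as the interval excluding the magnitude.
        "reject_substantial_positive": upper < d,
        "reject_substantial_negative": lower > -d,
    }


if __name__ == "__main__":
    # Hypothetical numbers: estimate 2.0, standard error 0.5, threshold 1.0.
    for key, value in assess_effect(2.0, 0.5, 1.0).items():
        print(key, value)
```

The three views differ in interpretation rather than arithmetic: the same two normal tail areas are read as interval coverage, as p values, or as posterior probabilities.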

Conflict of interest statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**FIGURE 1**
Conclusions about effects determined by coverage of the confidence or compatibility interval (CI), by tests based on rejection of one-sided interval hypotheses, and by Bayesian probabilities, for six qualitatively different dispositions of 90% CI (bars) relative to substantial and trivial magnitudes. +ive, substantial positive; −ive, substantial negative.

**FIGURE 2**
Compatibility intervals (thin bars, 95%; thick bars, 90%) illustrating significant effects where it would be appropriate (=) and inappropriate (≠) to conclude the effect is substantial, and non-significant effects where it would be appropriate and inappropriate to conclude the effect is trivial. The arrowhead indicates both compatibility intervals extending much further to the left.
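The mismatch that Figure 2 describes can be reproduced numerically. The Python sketch below is an illustration under stated assumptions, not the analysis behind the figure: it assumes a normal sampling distribution, symmetric thresholds of ±`smallest_important`, and the criterion that a conclusion of substantial or trivial is reasonable only when the 90% interval lies entirely within those values. The function name and the example numbers are hypothetical.

```python
from statistics import NormalDist


def compare_nil_test_with_interval(estimate, se, smallest_important, alpha=0.05):
    """Contrast the nil-hypothesis test with the position of the 90% interval."""
    d = smallest_important
    std_normal = NormalDist()

    # Two-sided p value for the nil (zero-effect) hypothesis.
    p_nil = 2 * (1 - std_normal.cdf(abs(estimate) / se))

    # 90% compatibility interval.
    z90 = std_normal.inv_cdf(0.95)
    lower, upper = estimate - z90 * se, estimate + z90 * se

    return {
        "p_nil": p_nil,
        "significant": p_nil < alpha,
        "ci90": (lower, upper),
        # Interval entirely beyond a threshold: substantial.
        "all_substantial": lower > d or upper < -d,
        # Interval entirely between the thresholds: trivial.
        "all_trivial": -d < lower and upper < d,
    }


if __name__ == "__main__":
    # Precise estimate of a small effect: significant, yet the interval is all trivial.
    print(compare_nil_test_with_interval(0.2, 0.05, 1.0))
    # Imprecise estimate: non-significant, yet substantial values remain compatible.
    print(compare_nil_test_with_interval(1.5, 1.0, 1.0))
```

In the first case significance would wrongly suggest a substantial effect; in the second, non-significance would wrongly suggest a trivial one.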
