PLoS One. 2020 Jun 26;15(6):e0235318.
doi: 10.1371/journal.pone.0235318. eCollection 2020.

Systematic review of the use of "magnitude-based inference" in sports science and medicine

Keith R Lohse et al. PLoS One. 2020.

Abstract

Magnitude-based inference (MBI) is a controversial statistical method that has been used in hundreds of papers in sports science despite criticism from statisticians. To better understand how this method has been applied in practice, we systematically reviewed 232 papers that used MBI. We extracted data on study design, sample size, and choice of MBI settings and parameters. Median sample size was 10 per group (interquartile range, IQR: 8-15) for multi-group studies and 14 (IQR: 10-24) for single-group studies; few studies reported a priori sample size calculations (15%). Authors predominantly applied MBI's default settings and chose "mechanistic/non-clinical" rather than "clinical" MBI even when testing clinical interventions (only 16 studies out of 232 used clinical MBI). Using these data, we can estimate the Type I error rates for the typical MBI study. Authors frequently made dichotomous claims about effects based on the MBI criterion of a "likely" effect and sometimes based on the MBI criterion of a "possible" effect. When the sample size is n = 8 to 15 per group, these inferences have Type I error rates of 12%-22% and 22%-45%, respectively. High Type I error rates were compounded by multiple testing: authors reported results from a median of 30 tests related to outcomes, and few studies specified a primary outcome (14%). We conclude that MBI has promoted small studies, promulgated a "black box" approach to statistics, and led to numerous papers where the conclusions are not supported by the data. Amidst debates over the role of p-values and significance testing in science, MBI also provides an important natural experiment: we find no evidence that moving researchers away from p-values or null hypothesis significance testing makes them less prone to dichotomization or over-interpretation of findings.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. PRISMA flowchart.
PRISMA flowchart showing the screening of articles through the systematic review process.
Fig 2. Example MBI inferences.
Ten hypothetical results and corresponding MBI inferences, assuming: a trivial range of -0.2 to 0.2 standard deviations, maximum risk of harm of 5%, and equivalent treatment of positive and negative directions (non-clinical MBI). MBI inferences correspond to the locations of the 50% and 90% confidence intervals relative to the negative (or harmful), positive (or beneficial), and trivial ranges. The result is deemed “unclear” if the 90% confidence interval spans the trivial range. Of note, minimal effect testing with α = 0.05, two-sided, would not arrive at conclusions of negative or positive for any of the examples shown. Equivalence testing with α = 0.05 would also fail to conclude equivalent (i.e., trivial difference) for any of the examples shown.
Fig 3. MBI’s Type I error rates.
A and B: Type I error rates for MBI’s “possible” (purple) and “likely” (red) thresholds, as well as standard hypothesis testing at α = 0.05 (blue) as a function of sample size. The statistical comparison is a two-group comparison of means. True effect size = 0, meaning there is no difference between the groups. (A) assumes variance of 0.364, as might arise in a pre-post study, whereas (B) assumes a variance of 1.0, as in a cross-sectional study. Shaded area shows the interquartile range of sample sizes of the reviewed studies; vertical reference line is the median sample size. Type I error rates were identical whether calculated mathematically or by simulation with 200,000 repeats (see S1 Appendix). C: MBI results from 5000 simulated trials where variance = 0.364 and n = 10 per group. D: MBI results from 5000 simulated trials where variance = 1.0 and n = 10 per group. Simulations and calculations use the MBI settings that predominate in the literature: trivial range of -0.2 to 0.2; maximum risk of harm of 5%; and equivalent treatment of positive and negative directions.
Fig 4. An example of MBI inferences in practice.
Left Panel (Reproduced from Parfey et al. [50], Fig 2C): Literature example where effects deemed “likely” by MBI are associated with large p-values. Confidence intervals are 95% CIs. Starred values are effects meeting MBI’s “likely” threshold. These results were interpreted as evidence of a difference between groups; the authors concluded: “Individuals with CLBP and PR manifested altered activation patterns during the hollowing maneuver compared to healthy controls.” Right panel: Simulation that shows the MBI inferences that are expected for a study of this type (n = 10 per group, cross-sectional) when the true effect is 0. Note that in both the real example and the simulation, most observed effects larger than 0.5 are deemed “likely”.
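
The settings named in the captions above (trivial range of -0.2 to 0.2, maximum risk of harm of 5%, equivalent treatment of positive and negative directions) can be turned into a rough simulation of how often non-clinical MBI claims an effect when the true effect is 0. The sketch below is not the authors' code (their calculations are in S1 Appendix) and rests on several simplifying assumptions: the variance is treated as known, so a normal distribution stands in for the t distribution; the MBI "chances" are computed as areas of a normal curve centered on the observed difference; and a "claim" is counted whenever the result is not "unclear" and the chance of a positive or negative effect reaches the chosen threshold (taken here as 0.25 for "possible" and 0.75 for "likely"). The function names `mbi_chances`, `claims_effect`, and `type1_rate` are hypothetical. Because of the normal approximation, the rates it prints should not be expected to match the published figures exactly.

```python
import math
import random
from statistics import NormalDist


def mbi_chances(diff, se, trivial=0.2):
    """Return (p_negative, p_trivial, p_positive) for an observed difference.

    Assumption: chances are areas of a normal curve centered on the observed
    difference with standard deviation equal to the standard error.
    """
    dist = NormalDist(mu=diff, sigma=se)
    p_neg = dist.cdf(-trivial)
    p_pos = 1.0 - dist.cdf(trivial)
    return p_neg, 1.0 - p_neg - p_pos, p_pos


def claims_effect(diff, se, threshold, trivial=0.2, max_harm=0.05):
    """True if non-clinical MBI would report a positive or negative effect."""
    p_neg, _, p_pos = mbi_chances(diff, se, trivial)
    # "Unclear": the chances of both a positive and a negative effect
    # exceed the maximum risk of harm.
    if p_neg > max_harm and p_pos > max_harm:
        return False
    return p_pos >= threshold or p_neg >= threshold


def type1_rate(n_per_group, variance, threshold, trials=20000, seed=1):
    """Estimate how often an effect is claimed when the true effect is 0."""
    rng = random.Random(seed)
    # Standard error of a difference in means between two equal-sized groups.
    se = math.sqrt(2.0 * variance / n_per_group)
    hits = 0
    for _ in range(trials):
        observed = rng.gauss(0.0, se)  # true difference is 0
        if claims_effect(observed, se, threshold):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for variance in (0.364, 1.0):
        for label, threshold in (("possible", 0.25), ("likely", 0.75)):
            rate = type1_rate(10, variance, threshold)
            print(f"variance={variance}, n=10, '{label}': {rate:.3f}")
```

Using the same seed for both thresholds means the "likely" claims are a subset of the "possible" claims, so the two estimates are directly comparable. Swapping the normal distribution for a t distribution with an estimated standard deviation would bring the sketch closer to a real two-group analysis.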

