The ongoing tyranny of statistical significance testing in biomedical research
- PMID: 20339903
- DOI: 10.1007/s10654-010-9440-x
Abstract
Since its introduction into the biomedical literature, statistical significance testing (SST) has caused much debate. The aim of this perspective article is to review frequent fallacies and misuses of SST in the biomedical field and to review a potential way out of them. Two frequentist schools of statistical inference merged to form SST as it is practised today: the Fisher school and the Neyman-Pearson school. The P-value is both reported quantitatively and checked against the alpha-level to produce a qualitative dichotomous measure (significant/nonsignificant). However, a P-value mixes the estimated effect size with its estimated precision, and a single number cannot measure both. For the valid interpretation of SSTs, a variety of presumptions and requirements have to be met. We point here to four of them: study size, correct statistical model, correct causal model, and absence of bias and confounding. It has been stated that the P-value is perhaps the most misunderstood statistical concept in clinical research. As in the social sciences, the tyranny of SST is still highly prevalent in the biomedical literature, even after decades of warnings against SST. The ubiquitous misuse and tyranny of SST threaten scientific discoveries and may even impede scientific progress. In the worst case, misuse of significance testing may even harm patients who are eventually treated incorrectly because of improper handling of P-values. For a proper interpretation of study results, both the estimated effect size and its estimated precision are necessary ingredients.
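The point that a P-value mixes effect size with precision can be illustrated with a small sketch. The two studies below are hypothetical (the estimates and standard errors are invented for illustration and do not come from the article), and the calculation assumes a simple normal approximation with a null value of zero. Both studies have the same ratio of estimate to standard error, so they yield the same P-value even though the effect sizes differ tenfold; only the estimate reported together with its confidence interval shows the difference.

```python
import math

def two_sided_p(estimate, se):
    """Two-sided P-value for H0: effect = 0, normal approximation.

    Uses 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2))

def ci95(estimate, se):
    """Approximate 95% confidence interval (estimate +/- 1.96 * SE)."""
    half_width = 1.959964 * se
    return (estimate - half_width, estimate + half_width)

# Hypothetical studies: (estimated effect, standard error). Both have z = 2.
studies = {
    "small effect, precise": (0.5, 0.25),
    "large effect, imprecise": (5.0, 2.5),
}

for name, (estimate, se) in studies.items():
    low, high = ci95(estimate, se)
    p = two_sided_p(estimate, se)
    print(f"{name}: estimate={estimate}, 95% CI=({low:.2f}, {high:.2f}), P={p:.4f}")
```

Both rows print the same P-value, while the interval estimates are very different. This is one way to see why reporting only "significant/nonsignificant" discards information that the effect estimate and its interval retain.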
Comment in
- Re: The ongoing tyranny of statistical significance testing in biomedical research. Eur J Epidemiol. 2010 Nov;25(11):843; author reply 844-5. doi: 10.1007/s10654-010-9507-8. Epub 2010 Nov 20. Corrected and republished in: Eur J Epidemiol. 2010 Dec;25(12):899-900. doi: 10.1007/s10654-010-9537-2. PMID: 21103912. No abstract available.
- Erratum to: Letter to the Editor: The ongoing tyranny of statistical significance testing in biomedical research. Eur J Epidemiol. 2010 Dec;25(12):899-900. doi: 10.1007/s10654-010-9537-2. PMID: 21190126. No abstract available.