Eur Radiol Exp. 2020 Mar 11;4(1):18.
doi: 10.1186/s41747-020-0145-y.

Statistical significance: p value, 0.05 threshold, and applications to radiomics-reasons for a conservative approach

Giovanni Di Leo et al. Eur Radiol Exp.

Abstract

Here, we summarise the unresolved debate about the p value and its dichotomisation. We present the statement of the American Statistical Association against the misuse of statistical significance, as well as the proposals to abandon the use of the p value and to reduce the significance threshold from 0.05 to 0.005. We highlight reasons for a conservative approach: clinical research needs dichotomous answers to guide decision-making, particularly in diagnostic imaging and interventional radiology. With a reduced p value threshold, the cost of research could increase while spontaneous research could be reduced. Secondary evidence from systematic reviews/meta-analyses, data sharing, and cost-effectiveness analyses are better ways to mitigate the false discovery rate and the lack of reproducibility associated with the use of the 0.05 threshold. Importantly, when reporting p values, authors should always provide the actual value, not only statements of "p < 0.05" or "p ≥ 0.05", because p values give a measure of the degree of data compatibility with the null hypothesis. Notably, radiomics and big data, fuelled by the application of artificial intelligence, involve hundreds or thousands of tested features, similarly to other "omics" such as genomics, where a reduction in the significance threshold, based on well-known corrections for multiple testing, has already been adopted.

Keywords: Confidence intervals; Decision making; Models (statistical); Radiomics; Reproducibility of results.
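
The abstract refers to "well-known corrections for multiple testing" without naming them. As an illustration only, the sketch below shows two commonly used corrections, Bonferroni and Benjamini-Hochberg, applied to a set of p values. The choice of these two methods, the function names, and the p values are assumptions made for this example and are not taken from the article.

```python
# Illustrative sketch: two common multiple-testing corrections.
# Function names and the example p values are hypothetical, not from the article.

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a test is declared significant
    while controlling the false discovery rate at alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from the smallest to the largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Ten hypothetical p values, e.g. from ten tested radiomic features.
    p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    threshold = bonferroni_threshold(0.05, len(p_values))
    print("Bonferroni per-test threshold:", threshold)
    print("Significant (Bonferroni):", [p < threshold for p in p_values])
    print("Significant (Benjamini-Hochberg):", benjamini_hochberg(p_values))
```

With these ten hypothetical p values, five fall below the uncorrected 0.05 threshold, whereas Bonferroni retains one and Benjamini-Hochberg retains two. This is the kind of effective threshold reduction that the abstract describes for settings with many tested features.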

Conflict of interest statement

FS is the Editor-in-Chief of European Radiology Experimental. The manuscript was managed by the Deputy Editor, Prof. Akos Varga-Szemes. In addition, FS declares having received grants from, or being a member of the speakers' bureau/advisory board for, Bayer, Bracco, and General Electric. The remaining author declares no competing interests.
