Editorial
Am J Phys Anthropol. 2018 May;166(1):236-245.
doi: 10.1002/ajpa.23399. Epub 2018 Jan 18.

The continuing misuse of null hypothesis significance testing in biological anthropology

Richard J Smith. Am J Phys Anthropol. 2018 May.

Abstract

The statistical literature has discussed the misuse and limitations of null hypothesis significance tests (NHST) for over 60 years. Given the prevalence of NHST in biological anthropology research, the discipline appears to be generally unaware of these concerns. The p values used in NHST are usually interpreted incorrectly. A p value indicates the probability of the observed data, or of data more extreme, given that the null hypothesis is true. It should not be interpreted as the probability that the null hypothesis is true, or as evidence for or against any specific alternative to the null hypothesis. P values are a function of both sample size and effect size, and therefore do not indicate whether the effect observed in a study is important, large, or small. P values replicate poorly in repeated experiments. P values vary continuously from 0 to 1.0, so the use of a cut-off, generally p ≤ 0.05, to separate significant from nonsignificant results is an arbitrary dichotomization of continuous variation. In 2016, the American Statistical Association issued a statement of principles regarding the misinterpretation of NHST, the first time in its 180-year history that it has done so for a specific statistical procedure. Effect sizes and confidence intervals, which can be calculated from any data used to calculate p values, provide more and better information about tested hypotheses than p values and NHST.

Keywords: confidence interval; effect size; falsification; p value; statistical significance.
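
The abstract's point that p values depend on both sample size and effect size can be illustrated with a small sketch. The example below is not taken from the editorial. It assumes the simplest possible setting, a one-sample z test with a known standard deviation, so that the test statistic is the standardized effect size times the square root of n. The function names `two_sided_p` and `confidence_interval` are hypothetical, chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def two_sided_p(effect_size: float, n: int) -> float:
    """Two-sided p value for a one-sample z test with known sd (an assumption).

    effect_size is the observed mean difference in sd units, so the
    test statistic is z = effect_size * sqrt(n).
    """
    z = effect_size * sqrt(n)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def confidence_interval(effect_size: float, n: int, level: float = 0.95):
    """Confidence interval for the standardized mean difference (known sd)."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    half_width = z_crit / sqrt(n)
    return effect_size - half_width, effect_size + half_width


if __name__ == "__main__":
    d = 0.2  # the same observed effect size in every row
    for n in (10, 50, 100, 400):
        p = two_sided_p(d, n)
        lo, hi = confidence_interval(d, n)
        print(f"n={n:4d}  d={d:.2f}  p={p:.4f}  95% CI=({lo:.3f}, {hi:.3f})")
```

With the effect size held at 0.2 standard deviations, the p value is above 0.05 at n = 50 and below it at n = 100, even though the observed effect is identical. The effect size and its confidence interval, by contrast, report the magnitude of the effect directly and show how the precision of the estimate improves as n grows.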
