Reducing Bias and Error in the Correlation Coefficient Due to Nonnormality
- PMID: 29795841
- PMCID: PMC5965513
- DOI: 10.1177/0013164414557639
Abstract
It is more common for educational and psychological data to be nonnormal than to be approximately normal. This tendency may lead to bias and error in point estimates of the Pearson correlation coefficient. In a series of Monte Carlo simulations, the Pearson correlation was examined under conditions of normal and nonnormal data, and it was compared with its major alternatives, including the Spearman rank-order correlation, the bootstrap estimate, the Box-Cox transformation family, and a general normalizing transformation (i.e., rankit), as well as with various bias adjustments. Nonnormality caused the correlation coefficient to be inflated by up to +.14, particularly when the nonnormality involved heavy-tailed distributions. Traditional bias adjustments worsened this problem, further inflating the estimate. The Spearman and rankit correlations eliminated this inflation and provided conservative estimates. Rankit also minimized random error for most sample sizes, except for the smallest samples (n = 10), where bootstrapping was more effective. Overall, results justify the use of carefully chosen alternatives to the Pearson correlation when normality is violated.
Keywords: Pearson; Spearman; correlation; nonnormal; normality; transformation.
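To make the compared estimators concrete, the following is a minimal Python sketch of the Pearson, Spearman, and rankit correlations. It is an illustration only, not the authors' simulation code. The abstract does not state the rankit formula, so the sketch assumes the commonly used definition, the inverse standard normal CDF of (rank - 0.5) / n. The function names and the lognormal demo data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rankit(values):
    """Assumed rankit definition: inverse normal CDF of (rank - 0.5) / n."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in average_ranks(values)]


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rankit_correlation(x, y):
    """Pearson correlation computed on rankit-transformed scores."""
    return pearson(rankit(x), rankit(y))


if __name__ == "__main__":
    # Hypothetical heavy-tailed demo data: exponentiated correlated normals.
    rng = random.Random(0)
    z1 = [rng.gauss(0, 1) for _ in range(40)]
    z2 = [rng.gauss(0, 1) for _ in range(40)]
    x = [math.exp(a) for a in z1]
    y = [math.exp(0.5 * a + math.sqrt(0.75) * b) for a, b in zip(z1, z2)]
    print("Pearson :", round(pearson(x, y), 3))
    print("Spearman:", round(spearman(x, y), 3))
    print("Rankit  :", round(rankit_correlation(x, y), 3))
```

Because both the Spearman and rankit estimates depend only on ranks, they are unchanged by any strictly increasing transformation of either variable, whereas the Pearson estimate is not. A single demo sample like this one says nothing about bias or error; assessing those requires repeated simulation, as in the study.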
Conflict of interest statement
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.