PLoS One. 2016 Mar 31;11(3):e0152719.
doi: 10.1371/journal.pone.0152719. eCollection 2016.

Statistically Controlling for Confounding Constructs Is Harder than You Think

Jacob Westfall et al. PLoS One.

Abstract

Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest (in some cases approaching 100%) when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity.
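
The core claim of the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code; it is a minimal illustration under stated assumptions: a single latent confound `t` drives both a predictor `x1` and the outcome `y` with the same hypothetical loading `rho`, the control variable `x2` measures `t` with a chosen reliability, and significance is judged with a normal approximation (|t| > 1.96). The function name `type1_error_rate` and all parameter values are illustrative. Because `x1` has no effect on `y` once the latent `t` is held fixed, every rejection counts as a Type I error with respect to that construct-level null.

```python
import numpy as np
from math import sqrt


def type1_error_rate(n, reliability, n_sims=500, rho=0.5, seed=0):
    """Fraction of simulated studies in which x1 is 'significant'
    after controlling for an unreliable measure x2 of the confound.

    Assumptions (illustrative, not taken from the paper):
    - latent confound t ~ N(0, 1)
    - x1 and y each correlate rho with t and are otherwise unrelated
    - x2 = t + noise, with var(t) / var(x2) equal to `reliability`
    """
    rng = np.random.default_rng(seed)
    noise_sd = sqrt((1.0 - reliability) / reliability)
    resid_sd = sqrt(1.0 - rho ** 2)
    rejections = 0
    for _ in range(n_sims):
        t = rng.standard_normal(n)
        x1 = rho * t + resid_sd * rng.standard_normal(n)
        y = rho * t + resid_sd * rng.standard_normal(n)
        x2 = t + noise_sd * rng.standard_normal(n)

        # Ordinary least squares of y on an intercept, x1, and x2.
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov_beta = sigma2 * np.linalg.inv(X.T @ X)

        # Test statistic for the x1 coefficient (normal approximation).
        t_stat = beta[1] / sqrt(cov_beta[1, 1])
        if abs(t_stat) > 1.96:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for n in (50, 500, 2000):
        for rel in (0.6, 0.8, 1.0):
            rate = type1_error_rate(n, rel, n_sims=200)
            print(f"n={n:5d}  reliability={rel:.1f}  rejection rate={rate:.3f}")
```

With perfect reliability the rejection rate should stay near the nominal 5%. With imperfect reliability the residual confounding left in `x1` becomes easier to detect as `n` grows, so the rejection rate rises with sample size, which is the counterintuitive pattern the abstract describes.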

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Plot of subjective heat ratings on a 7-point Likert scale against the “true” underlying daily temperatures.
Fig 2. Illustration of residual confounding.
(A) Simple relationship between daily swimming pool deaths and number of ice cream cones sold. (B) Relationship between daily swimming pool deaths and number of ice cream cones sold after controlling for subjective heat Likert ratings. (C) Relationship between daily swimming pool deaths and number of ice cream cones sold after controlling for recorded daily temperatures.
Fig 3. Contour plots of Type I error probabilities for the argument for predictive utility.
The null hypothesis is that T1 has no partial relationship with Y after controlling for T2 (i.e., ρ1.2 = 0). The size of the true indirect effect of T1 on Y via T2 varies from small (panel A) to medium (panel B) to large (panel C).
Fig 4. Contour plots of Type I error probabilities for the argument for separable constructs.
The alternative hypothesis is that both predictors are separately related to the outcome; the corresponding null hypothesis is that at least one of the predictors is not related to the outcome. The magnitude of the true correlation between Y and T varies from small (panel A) to medium (panel B) to large (panel C).
Fig 5. Contour plots of Type I error probabilities for the argument for improved measurement.
The null hypothesis is that the two predictors have the same partial correlation with the outcome. The magnitude of the true partial correlation varies from small (panel A) to medium (panel B) to large (panel C). Varying δ has little impact on the error rates, so we fix it at δ = .5 in all three panels.
Fig 6. Test statistics from models regressing BRI outcomes on both the NEO and HEXACO versions of a factor.
The test statistics are t-statistics for the regression models and z-statistics for the SEM models. BRI = Behavioral Report Inventory. SEM = Structural Equation Model.
Fig 7. Test statistics from models regressing BRI outcomes on both the NEO and HEXACO versions of a factor.
The test statistics are t-statistics for the regression models and z-statistics for the SEM models. BRI = Behavioral Report Inventory. SEM = Structural Equation Model.
Fig 8. Test statistics from models predicting BRI outcomes.
The test statistics are t-statistics for the regression models and z-statistics for the SEM models. BRI = Behavioral Report Inventory. SEM = Structural Equation Model.
Fig 9. Path diagram for a SEM predicting drug use, allowing for specified degrees of reliability in the observed NEO and HEXACO scores.
Circle nodes represent latent variables, square nodes represent observed variables, solid lines represent paths or variances to be estimated from the data, and dashed lines represent paths or variances that are fixed to constant, a priori values. SEM = Structural Equation Model.
Fig 10. Test statistics as a function of assumed reliability.
The shaded region gives the range within which the test statistics are nonsignificant. In each model, assuming reliabilities below a certain value invariably caused the model to fail to converge or to yield an inadmissible solution (i.e., impossible correlation matrices for the latent variables); we only plot the results for reliability values that successfully converge on stable estimates.
Fig 11. Incremental validity in multiple regression vs. SEM.
The SEM results are from a simulation using 300,000 iterations. The multiple regression results are computed analytically. The SEM line in the left panel is a smoothed curve derived from fitting a generalized additive model with a binomial response to the simulation results tracking whether the null hypothesis was rejected. In the right panel, the SEM line and shaded region are based on first applying rolling medians of width 101 to the simulated regression coefficients and standard errors (to reduce the distorting influence of extreme outlying parameter estimates occurring particularly at low reliability values), and then fitting a generalized additive model to these rolling medians. SEM = Structural Equation Model.
Fig 12. Power to detect incremental validity using SEM.
The lines in each panel are smoothed curves derived from fitting generalized additive models with a binomial response to the simulation results. SEM = Structural Equation Model.
