Simplifying the Assessment of Measurement Invariance over Multiple Background Variables: Using Regularized Moderated Nonlinear Factor Analysis to Detect Differential Item Functioning

Daniel J Bauer¹,², William C M Belzak¹, Veronica Cole²

Affiliations

¹ Department of Psychology and Neuroscience, The University of North Carolina at Chapel Hill.
² Center for Developmental Science, The University of North Carolina at Chapel Hill.

PMID: 33132679
PMCID: PMC7596881
DOI: 10.1080/10705511.2019.1642754

Daniel J Bauer et al. Struct Equ Modeling. 2020;27(1):43-55. doi: 10.1080/10705511.2019.1642754. Epub 2019 Sep 5.

Abstract

Determining whether measures are equally valid for all individuals is a core component of psychometric analysis. Traditionally, the evaluation of measurement invariance (MI) involves comparing independent groups defined by a single categorical covariate (e.g., men and women) to determine if there are any items that display differential item functioning (DIF). More recently, Moderated Nonlinear Factor Analysis (MNLFA) has been advanced as an approach for evaluating MI/DIF simultaneously over multiple background variables, categorical and continuous. Unfortunately, conventional procedures for detecting DIF do not scale well to the more complex MNLFA. The current manuscript therefore proposes a regularization approach to MNLFA estimation that penalizes the likelihood for DIF parameters (i.e., rewarding sparse DIF). This procedure avoids the pitfalls of sequential inference tests, is automated for end users, and is shown to perform well in both a small-scale simulation and an empirical validation study.
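The sparsity idea in the abstract can be illustrated with a minimal sketch. It assumes a lasso-type (L1) penalty, which the abstract does not name, and it uses made-up numbers; the function names `penalized_loglik` and `soft_threshold` and the tuning value `tau` are hypothetical and are not taken from the article or from any software package. The sketch shows only how an L1 penalty sets small DIF effects exactly to zero so that the remaining nonzero effects flag candidate DIF; it is not the authors' estimation procedure.

```python
def penalized_loglik(loglik, dif_params, tau):
    """Log-likelihood minus an L1 penalty on the DIF parameters (assumed form)."""
    return loglik - tau * sum(abs(b) for b in dif_params)


def soft_threshold(z, tau):
    """Shrink z toward zero by tau; values within [-tau, tau] become exactly 0."""
    if z > tau:
        return z - tau
    if z < -tau:
        return z + tau
    return 0.0


# Illustrative (made-up) unpenalized DIF effects for five item-by-covariate terms.
dif_estimates = [0.05, -0.10, 0.80, 0.00, -0.60]
tau = 0.25  # hypothetical tuning parameter

shrunk = [soft_threshold(b, tau) for b in dif_estimates]
flagged = [i for i, b in enumerate(shrunk) if b != 0.0]

print(flagged)  # indices of terms whose DIF effect survives the penalty: [2, 4]
```

A larger `tau` removes more DIF terms and a smaller one retains more, which is why the choice of tuning parameter matters for the results summarized in the figures.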

Figures

**Figure 1.**
Percentage of items without Differential Item Functioning (DIF) for which DIF was (a) retained and significant following regularization (light gray); (b) retained following regularization, regardless of significance (medium gray); (c) significant by the item response theory likelihood ratio test (black). The tuning parameter for regularized DIF evaluation was determined by minimizing Bayes’ Information Criterion.

**Figure 2.**
Percentage of items with Differential Item Functioning (DIF) for which DIF was (a) retained and significant following regularization (light gray); (b) retained following regularization, regardless of significance (medium gray); (c) significant by the item response theory likelihood ratio test (black). The tuning parameter for regularized DIF evaluation was determined by minimizing Bayes’ Information Criterion.

**Figure 3.**
Percentage of items without Differential Item Functioning (DIF) for which DIF was (a) retained and significant following regularization (light gray); (b) retained following regularization, regardless of significance (medium gray); (c) significant by the item response theory likelihood ratio test (black). The tuning parameter for regularized DIF evaluation was determined by minimizing Akaike’s Information Criterion.

**Figure 4.**
Percentage of items with Differential Item Functioning (DIF) for which DIF was (a) retained and significant following regularization (light gray); (b) retained following regularization, regardless of significance (medium gray); (c) significant by the item response theory likelihood ratio test (black). The tuning parameter for regularized DIF evaluation was determined by minimizing Akaike’s Information Criterion.
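As a point of reference, the two information criteria named in the captions have the following standard forms. The symbols are assumptions introduced here rather than notation from the source: $\hat{\ell}_\tau$ is the log-likelihood of the model fitted at tuning value $\tau$, $k_\tau$ is the number of parameters counted for that model, and $N$ is the sample size. How the article counts parameters under the penalty is not stated on this page.

```latex
\mathrm{AIC}(\tau) = -2\,\hat{\ell}_\tau + 2\,k_\tau,
\qquad
\mathrm{BIC}(\tau) = -2\,\hat{\ell}_\tau + k_\tau \ln N
```

Because $\ln N > 2$ whenever $N \ge 8$, BIC charges more per parameter than AIC, so minimizing BIC tends to favor models with fewer parameters than minimizing AIC does.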

Cited by

Using Interpretable Machine Learning for Differential Item Functioning Detection in Psychometric Tests.
Kraus EB, Wild J, Hilbert S. Appl Psychol Meas. 2024 Jul;48(4-5):167-186. doi: 10.1177/01466216241238744. Epub 2024 Mar 11. PMID: 39055539. Free PMC article.
Can severity of substance use be measured across drug classes? Estimating differential item functioning by drug class in two general measures of substance use severity.
Janulis P, Luo J, Tang X, Schalet BD. Drug Alcohol Depend. 2023 Sep 1;250:110877. doi: 10.1016/j.drugalcdep.2023.110877. Epub 2023 Jul 5. PMID: 37441960. Free PMC article.
Empathy and Autism: Establishing the Structure and Different Manifestations of Empathy in Autistic Individuals Using the Perth Empathy Scale.
Brett JD, Preece DA, Becerra R, Whitehouse A, Maybery MT. J Autism Dev Disord. 2024 Aug 8. doi: 10.1007/s10803-024-06491-3. Online ahead of print. PMID: 39115741.
Single- and Multiple-Group Penalized Factor Analysis: A Trust-Region Algorithm Approach with Integrated Automatic Multiple Tuning Parameter Selection.
Geminiani E, Marra G, Moustaki I. Psychometrika. 2021 Mar;86(1):65-95. doi: 10.1007/s11336-021-09751-8. Epub 2021 Mar 26. PMID: 33768403. Free PMC article.
Modelling nonlinear moderation effects with local structural equation modelling (LSEM): A non-technical introduction.
Liu T, Ding R, Su Z, Peng Z, Hildebrandt A. Int J Psychol. 2025 Feb;60(1):e13259. doi: 10.1002/ijop.13259. Epub 2024 Oct 19. PMID: 39425575. Free PMC article.

Grants and funding

R01 DA034636/DA/NIDA NIH HHS/United States
