Psychometrika. 2025 Jul 31;90(3):1-38.
doi: 10.1017/psy.2025.10032. Online ahead of print.

Explaining Person-by-Item Responses using Person- and Item-Level Predictors via Random Forests and Interpretable Machine Learning in Explanatory Item Response Models

Sun-Joo Cho et al. Psychometrika.

Abstract

This study incorporates a random forest (RF) approach, which probes complex interactions and nonlinearity among predictors, into an item response model. The goal is a hybrid approach that explains item responses better than either an RF or an explanatory item response model (EIRM) alone. In the specified model, called EIRM-RF, the values predicted by the RF are added as a predictor in the EIRM to model the nonlinear and interaction effects of person- and item-level predictors in person-by-item response data, while accounting for random effects over persons and items. The results of the EIRM-RF are probed with interpretable machine learning (ML) methods, including feature importance measures, partial dependence plots, accumulated local effect plots, and the H-statistic. The EIRM-RF and the interpretable ML methods are illustrated using an empirical data set to explain differences in reading comprehension between digital and paper media. The results of the EIRM-RF are compared with those of the EIRM and the RF to show how the three models differ empirically in modeling the effects of predictors and random effects. In addition, simulation studies are conducted to compare model accuracy among the three models and to evaluate the performance of the interpretable ML methods.
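As a rough illustration of one of the interpretable ML methods named above, the sketch below computes a partial dependence curve in plain Python: for each grid value, the predictor of interest is set to that value in every row, and the model's predictions are averaged. The prediction function `toy_predict`, the predictor names `x_c` and `z`, and the data are hypothetical stand-ins, not the authors' EIRM-RF or their data.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Return the average prediction with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the predictor of interest in a copy of every row;
        # all other predictors keep their observed values.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(row) for row in modified))
    return curve


def toy_predict(row):
    # Hypothetical stand-in for a fitted model's prediction function.
    return 2.0 * row["x_c"] + row["z"]


# Hypothetical data: one mean-centered continuous predictor and one other predictor.
rows = [
    {"x_c": -1.5, "z": 0},
    {"x_c": -0.5, "z": 1},
    {"x_c": 0.5, "z": 2},
    {"x_c": 1.5, "z": 3},
]

pd_curve = partial_dependence(toy_predict, rows, "x_c", [-1.0, 0.0, 1.0])
print(pd_curve)
```

Because the toy model is additive, the resulting curve is a straight line in `x_c`; with a model that contains interactions, the averaging can hide them, which is one reason to inspect accumulated local effect plots and the H-statistic alongside partial dependence.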

Keywords: explanatory item response theory; interpretable machine learning; mixed-effects machine learning; random forests.

Conflict of interest statement

The authors have no competing interests to declare that are relevant to the content of this article.

Figures

Figure 1. Empirical study: feature importance measures for EIRM-RF. Note: In the predictor (feature) name, “nom” indicates a nominal predictor, and “c” indicates a mean-centered predictor.
Figure 2. Empirical study: partial dependence plots from EIRM-RF. Note: Continuous predictors were mean-centered; each tick mark on the x-axis represents a value of a continuous predictor.
Figure 3. Empirical study: partial dependence plots from EIRM-RF. Note: Continuous predictors were mean-centered; each tick mark on the x-axis represents a value of a continuous predictor.
Figure 4. Empirical study: partial dependence plots from EIRM-RF. Note: Continuous predictors were mean-centered; each tick mark on the x-axis represents a value of a continuous predictor.
Figure 5. Empirical study: accumulated local effect plots from EIRM-RF. Note: Continuous predictors were mean-centered; each tick mark on the x-axis represents a value of a continuous predictor.
Figure 6. Empirical study: the H-statistic of all predictors from EIRM-RF. Note: In the predictor (feature) name, “nom” indicates a nominal predictor, and “c” indicates a mean-centered predictor; values on the x-axis indicate the strengths of the interaction effects.
Figure 7. Empirical study: the H-statistic of the two-way interactions between the item format and all other predictors from EIRM-RF. Note: In the predictor (feature) name, “nom” indicates a nominal predictor, and “c” indicates a mean-centered predictor; values on the x-axis indicate the strengths of the interaction effects.
Figure A1. Visualization of a generated tree structure.
Figure A2. Selected plots of “true” partial dependence (top) and accumulated local effects (bottom). Note: Continuous predictors were mean-centered; to present patterns in partial dependence and accumulated local effects, the x- and y-axes of the two selected predictors were not displayed on the same scale.
Figure A3. Visualization of a generated tree structure.
Figure A4. Selected plots of “true” partial dependence (top) and accumulated local effects (bottom). Note: Continuous predictors were mean-centered; to present patterns in partial dependence and accumulated local effects, the x- and y-axes of the two selected predictors were not displayed on the same scale.
