CPT Pharmacometrics Syst Pharmacol. 2025 Apr;14(4):621-639. doi: 10.1002/psp4.13306. Epub 2025 Jan 20.

Covariate Model Selection Approaches for Population Pharmacokinetics: A Systematic Review of Existing Methods, From SCM to AI

Mélanie Karlsen et al.

Abstract

A growing number of covariate modeling methods have been proposed in the field of popPK modeling, but limited information exists on how they compare. The objective of this study was to perform a systematic review of all popPK covariate modeling methods, focusing on the existing knowledge of their performance. For each method in each article included in this review, the evaluation setting, the performance metrics with their associated values, and the relative computational times were reported when available. Evaluation settings were reported to allow the uncertainty of the communicated results to be assessed. Results showed that EBE-based ML methods stood out as the best covariate selection methods. AALASSO, a hybrid genetic algorithm, FREM with a clinical significance criterion, and SCM+ with stagewise filtering were the best covariate model selection techniques, with AALASSO performing best overall. Results also showed a lack of consensus both on how to benchmark simulated datasets across different scenarios when evaluating method performance and on which metrics to use for method evaluation. We propose to systematically report the TPR (sensitivity), FPR (Type I error), FNR (Type II error), TNR (specificity), covariate parameter error bias (MPE) and precision (RMSE), clinical relevance, and model fitness, by means of the BIC, concentration prediction error bias (MPE), and precision (RMSE), of newly proposed methods, and to compare them with SCM. We propose to systematically combine covariate selection techniques with SCM or FFEM to allow for comparison with SCM. We also highlight the need for an open-source benchmark of simulated datasets covering a representative set of scenarios.
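The reporting scheme proposed above can be made concrete with a short sketch. The code below computes the covariate-selection classification rates (TPR, FPR, FNR, TNR) and the relative bias (MPE) and precision (RMSE) of parameter or concentration estimates; the function and variable names are illustrative, not taken from the article.

```python
# Illustrative sketch of the metrics the review proposes to report for a new
# covariate modeling method evaluated on simulated data. Names are hypothetical.

def selection_rates(true_covariates, selected_covariates, candidates):
    """Classification rates for one covariate selection run, given the set of
    truly influential covariates, the set the method selected, and the full
    candidate pool."""
    tp = sum(1 for c in candidates if c in true_covariates and c in selected_covariates)
    fp = sum(1 for c in candidates if c not in true_covariates and c in selected_covariates)
    fn = sum(1 for c in candidates if c in true_covariates and c not in selected_covariates)
    tn = sum(1 for c in candidates if c not in true_covariates and c not in selected_covariates)
    return {
        "TPR": tp / (tp + fn) if tp + fn else float("nan"),  # sensitivity
        "FPR": fp / (fp + tn) if fp + tn else float("nan"),  # Type I error
        "FNR": fn / (tp + fn) if tp + fn else float("nan"),  # Type II error
        "TNR": tn / (fp + tn) if fp + tn else float("nan"),  # specificity
    }

def mpe(estimates, truths):
    """Mean (relative) prediction error: the bias of estimates vs. true values."""
    errs = [(e - t) / t for e, t in zip(estimates, truths)]
    return sum(errs) / len(errs)

def rmse(estimates, truths):
    """Root mean squared (relative) error: the precision of the estimates."""
    errs = [((e - t) / t) ** 2 for e, t in zip(estimates, truths)]
    return (sum(errs) / len(errs)) ** 0.5
```

For example, if the true model contains weight and creatinine clearance but the method selects weight and sex out of a four-covariate pool, `selection_rates({"WT", "CRCL"}, {"WT", "SEX"}, {"WT", "CRCL", "SEX", "AGE"})` yields a TPR and FPR of 0.5 each. Averaging these rates, and the MPE/RMSE of the corresponding parameter estimates, over replicate simulated datasets gives the per-scenario summary the review recommends reporting alongside SCM as a reference.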

Keywords: artificial intelligence; covariate model building; covariate modeling; covariate screening; machine learning; pharmacometrics; population pharmacokinetic.


Conflict of interest statement

M. Karlsen, D. Fabre, D. Marchionni, and E. Calvier are Sanofi employees and may hold shares and/or stock options in the company. All other authors declared no competing interests for this work.

Figures

FIGURE 1
Procedure for building the database of articles. n refers to the number of articles when relevant.
FIGURE 2
Methods handled in the database of articles according to their category and objective. Method abbreviations: AALASSO, Adjusted Adaptive Least Absolute Shrinkage and Selection Operator; COSSAC, COnditional Sampling use for Stepwise Approach based on Correlation tests; FFEM, Full Fixed Effects Model; FREM, Full Random Effects Model; GA1, Genetic Algorithm developed by Ismail et al.; GA2, Genetic Algorithm developed by Ronchi et al.; GAM, Generalized Additive Models; GEP, Gene Expression Programming; H‐GA‐ML, Hybrid‐Genetic Algorithm‐Machine Learning; H‐WAM‐BE, Hybrid WAM with Backward Elimination; LASSO, Least Absolute Shrinkage and Selection Operator; MARS, Multivariate Adaptive Regression Splines; ML, Machine Learning (includes (regularized) (stepwise) linear regression, random forests, neural networks, extreme gradient boosting, and support vector machines); REG, Regression; SAMBA, Stochastic Approximation for Model Building Algorithm; SCM, Stepwise Covariate Model; SCM+ with SF, Stepwise Covariate Model+ with Stage‐wise Filtering; SHAP, SHapley Additive exPlanations; WAM, Wald's Approximation Method.
FIGURE 3
Overall reported comparative performances across methods. The dashed lines separate groups of methods having reportedly equivalent performance regarding their objective. Blue line: comparison performed only on simulated data. Mauve line: comparison performed only on real data. Beige line: comparison performed on both real and simulated data. Multiple colors: the comparison was made across articles using different evaluation settings. The width of the line indicates the number of scenarios (whether real or simulated) investigated for the comparison (the larger the width, the more scenarios were investigated). When multiple articles made the same comparison, the maximum number of scenarios investigated across those articles was retrieved. LASSO appears twice in the figure because one comparison of LASSO to SCM [23] contradicts all the others. Boxes are colored according to the methods' objective. Light pink: method performing covariate selection. Pink: method performing covariate model selection. Purple: method performing covariate model selection and other tasks. Orange: alternative methods. Method abbreviations: AALASSO, Adjusted Adaptive Least Absolute Shrinkage and Selection Operator; COSSAC, COnditional Sampling use for Stepwise Approach based on Correlation tests; FFEM, Full Fixed Effects Model; FREM, Full Random Effects Model; GA1, Genetic Algorithm developed by Ismail et al.; GA2, Genetic Algorithm developed by Ronchi et al.; GAM, Generalized Additive Models; GEP, Gene Expression Programming; H‐GA‐ML, Hybrid–Genetic Algorithm–Machine Learning; H‐WAM‐BE, Hybrid WAM with Backward Elimination; LASSO, Least Absolute Shrinkage and Selection Operator; MARS, Multivariate Adaptive Regression Splines; ML, Machine Learning (includes (regularized) (stepwise) linear regression, random forests, neural networks, extreme gradient boosting, and support vector machines); REG, Regression; SAMBA, Stochastic Approximation for Model Building Algorithm; SCM, Stepwise Covariate Model; SCM+ with SF, Stepwise Covariate Model+ with Stage‐wise Filtering; SHAP, SHapley Additive exPlanations; WAM, Wald's Approximation Method.
FIGURE 4
Overall reported comparative computational speed across methods. Blue line: comparison performed only on simulated data. Mauve line: comparison performed only on real data. Beige line: comparison performed on both real and simulated data. Multiple colors: the comparison was made across articles using different evaluation settings. The width of the line indicates the number of scenarios (whether real or simulated) investigated for the comparison (the larger the width, the more scenarios were investigated). When several articles made the same comparison, the maximum number of scenarios investigated across those articles was retrieved. Black box: no magnitude of the difference in computational speed was reported. Boxes are colored according to the methods' objective. Light pink: method performing covariate selection. Pink: method performing covariate model selection. Purple: method performing covariate model selection and other tasks. Orange: alternative methods. Method abbreviations: AALASSO, Adjusted Adaptive Least Absolute Shrinkage and Selection Operator; COSSAC, COnditional Sampling use for Stepwise Approach based on Correlation tests; FFEM, Full Fixed Effects Model; FREM, Full Random Effects Model; GA1, Genetic Algorithm developed by Ismail et al.; GA2, Genetic Algorithm developed by Ronchi et al.; GAM, Generalized Additive Models; GEP, Gene Expression Programming; H‐GA‐ML, Hybrid–Genetic Algorithm–Machine Learning; H‐WAM‐BE, Hybrid WAM with Backward Elimination; LASSO, Least Absolute Shrinkage and Selection Operator; MARS, Multivariate Adaptive Regression Splines; ML, Machine Learning (includes (regularized) (stepwise) linear regression, random forests, neural networks, extreme gradient boosting, and support vector machines); REG, Regression; SAMBA, Stochastic Approximation for Model Building Algorithm; SCM, Stepwise Covariate Model; SCM+ with SF, Stepwise Covariate Model+ with Stage‐wise Filtering; SHAP, SHapley Additive exPlanations; WAM, Wald's Approximation Method.

References

    1. Page M. J., McKenzie J. E., Bossuyt P. M., et al., "The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews," BMJ 372 (2021): n71.
    2. Ribbing J., Nyberg J., Caster O., and Jonsson E. N., "The Lasso—A Novel Method for Predictive Covariate Model Building in Nonlinear Mixed Effects Models," Journal of Pharmacokinetics and Pharmacodynamics 34 (2007): 485–517.
    3. Philipp M., Buatois S., Retout S., and Mentré F., "Impact of Covariate Model Building Methods on Their Clinical Relevance Evaluation in Population Pharmacokinetic Analyses: Comparison of the Full Model, Stepwise Covariate Model (SCM) and SCM+ Approaches," Journal of Pharmacokinetics and Pharmacodynamics 51 (2024): 653–670.
    4. Jonsson E. N. and Karlsson M. O., "Automated Covariate Model Building Within NONMEM," Pharmaceutical Research 15 (1998): 1463–1468.
    5. Ayral G., Abdallah J.‐F. S., Magnard C., and Chauvin J., "A Novel Method Based on Unbiased Correlations Tests for Covariate Selection in Nonlinear Mixed Effects Models: The COSSAC Approach," CPT: Pharmacometrics & Systems Pharmacology 10 (2021): 318–329.
