BMC Med Res Methodol. 2025 Jan 9;25(1):4.

doi: 10.1186/s12874-025-02457-w.

Identify the underlying true model from other models for clinical practice using model performance measures

Yan Li¹

PMID: 39789439
PMCID: PMC11715858
DOI: 10.1186/s12874-025-02457-w

Yan Li. BMC Med Res Methodol. 2025.

Affiliation

¹ School of Mathematical Sciences, Xiamen University, Xiamen, 361005, People's Republic of China. yan.li2020@outlook.com.

Abstract

Objective: To assess whether the true outcome-generating model can be identified among other candidate models for clinical practice using current conventional model performance measures, across a range of simulation scenarios and with cardiovascular disease (CVD) risk prediction as an exemplar.

Study design and setting: Thousands of true-model scenarios were used to simulate clinical data. Candidate models and the true models were trained on training datasets and then compared on testing datasets using 25 conventionally used model performance measures. The study comprises a univariate simulation (179.2k simulated datasets and over 1.792 million models), a multivariate simulation (728k simulated datasets and over 8.736 million models) and a CVD risk prediction case analysis.
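The train-then-compare design described above can be illustrated with a minimal sketch. It is not the author's code: the coefficients, sample sizes, variable names and the single scenario shown are all hypothetical assumptions, only the C statistic is computed (not the full set of 25 measures), and a plain gradient-descent logistic regression stands in for whatever fitting procedure the study used.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def c_statistic(y_true, scores):
    """Share of (case, non-case) pairs where the case scores higher; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def fit_logistic(X, y, lr=0.5, n_iter=200):
    """Fit intercept and coefficients by full-batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def simulate(n, rng):
    """Rows are (x1, x2, z): two causal predictors and one pure-noise variable."""
    rows, y = [], []
    for _ in range(n):
        x1, x2, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(-0.5 + 1.0 * x1 + 1.0 * x2)  # hypothetical true model
        rows.append((x1, x2, z))
        y.append(1 if rng.random() < p else 0)
    return rows, y

def run(seed=0, n_train=1000, n_test=1000):
    rng = random.Random(seed)
    train, y_train = simulate(n_train, rng)
    test, y_test = simulate(n_test, rng)
    # Column indices used by each candidate model (names are illustrative).
    column_sets = {"true": (0, 1), "extra_noise": (0, 1, 2), "proxy_missing_x2": (0,)}
    results = {}
    for name, cols in column_sets.items():
        pick = lambda rows: [[r[c] for c in cols] for r in rows]
        w = fit_logistic(pick(train), y_train)
        results[name] = c_statistic(y_test, predict(w, pick(test)))
    # Flip-coin model: random scores unrelated to any predictor.
    results["flip_coin"] = c_statistic(y_test, [rng.random() for _ in y_test])
    return results

results = run()
for name, c in results.items():
    print(f"{name:18s} C = {c:.3f}  diff vs true = {c - results['true']:+.3f}")
```

Repeating `run` over many seeds and many sets of true coefficients would give a distribution of C statistic differences per model type, which is the kind of quantity summarised in the boxplots of Figs. 1 and 2.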

Results: The overall C statistic (95% range) of the true models was 0.67 (0.51, 0.96) across all scenarios in the univariate simulation, 0.81 (0.54, 0.98) in the multivariate simulation, 0.85 (0.82, 0.88) in the univariate case analysis and 0.85 (0.82, 0.88) in the multivariate case analysis. The measures showed very clear differences between the true model and the flip-coin model, little or no difference between the true model and candidate models with extra noise, and relatively small differences between the true model and proxy models missing causal predictors.

Conclusion: The true model is not always identified as the best-performing model by current conventional measures for binary outcomes, even when the true model is present in the clinical data. New statistical approaches or measures should be established to distinguish the causal true model from proxy models, especially proxy models with extra noise and/or missing causal predictors.

Keywords: Cardiovascular disease; Clinical risk prediction model; Model performance measures; Outcome generation true model.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The research protocol was approved by the University Presidential Scholarship team. Ethical approval was not applied here since this is a methodological simulation and case study that aims to improve the application of clinical risk prediction models to chronic diseases such as CVD (i.e., to better prevent CVD), and there is no direct involvement of identifiable human subjects in the study. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
Boxplot of differences of C statistics from candidate models to the true model in univariate simulations. X axis: type of models. Y axis: Differences of C statistics from candidate models to the true model

**Fig. 2**
Boxplot of differences of C statistics from candidate models to the true model in multivariate simulations. X axis: type of models. Y axis: Differences of C statistics from candidate models to the true model
