J Am Med Inform Assoc. 2018 Aug 1;25(8):969-975.
doi: 10.1093/jamia/ocy032.

Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data

Jenna M Reps et al. J Am Med Inform Assoc.

Abstract

Objective: To develop a conceptual framework of standardized steps for building prediction models, and to describe the corresponding open-source software that implements the framework consistently across computational environments and observational healthcare databases, enabling model sharing and reproducibility.

Methods: Based on existing best practices, we propose a 5-step standardized framework for: (1) transparently defining the problem; (2) selecting suitable datasets; (3) constructing variables from the observational data; (4) learning the predictive model; and (5) validating the model performance. We implemented this framework as open-source software that uses the Observational Medical Outcomes Partnership (OMOP) Common Data Model, so that models can be shared conveniently and their evaluation reproduced across multiple observational datasets. The software implementation contains default covariates and classifiers, but the framework allows customization and extension.
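The five steps can be pictured as a small pipeline. The following is a minimal sketch in plain Python, not the authors' software: every name in it (`PredictionProblem`, `run_framework`, `fit_single_covariate`, the `prior_visits` covariate, the 75% training split) is a hypothetical stand-in chosen for illustration, and the learner is deliberately trivial.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionProblem:
    """Step 1: an explicit, shareable statement of the prediction problem."""
    target_cohort: str                   # who we predict for
    outcome: str                         # what we predict
    time_at_risk_days: tuple[int, int]   # when, relative to the index date


def auc(scores, labels):
    """Step 5 helper: chance that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_single_covariate(train):
    """Step 4 (toy learner): score a patient by one covariate, scaled to the training max."""
    top = max(x["prior_visits"] for x, _ in train) or 1
    return lambda x: x["prior_visits"] / top


def run_framework(problem, rows, fit, train_fraction=0.75):
    # Step 2: `rows` stands in for a dataset already judged suitable.
    # Step 3: each row is (covariates, label), built from the observational data.
    split = int(len(rows) * train_fraction)
    train, test = rows[:split], rows[split:]
    model = fit(train)                                        # Step 4: learn
    scores = [model(x) for x, _ in test]
    labels = [y for _, y in test]
    return {"problem": problem, "auc": auc(scores, labels)}   # Step 5: validate


problem = PredictionProblem("depression", "hypothetical outcome", (1, 365))
rows = [({"prior_visits": v}, y)
        for v, y in [(1, 0), (5, 1), (2, 0), (6, 1), (0, 0), (4, 1), (1, 0), (7, 1)]]
result = run_framework(problem, rows, fit_single_covariate)
print(result["auc"])
```

Keeping the problem definition as a separate, immutable object is one way to make step 1 explicit: the same definition can be handed to a different dataset or learner without editing the pipeline.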

Results: As a proof of concept demonstrating the transparency and ease of model dissemination using the software, we developed prediction models for 21 different outcomes within a target population of people suffering from depression across 4 observational databases. All 84 models (21 outcomes × 4 databases) are available in an accessible online repository and can be implemented by anyone with access to an observational database in the Common Data Model format.
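The count of 84 models follows from pairing each of the 21 outcomes with each of the 4 databases. A short sketch of that enumeration is shown here; the outcome and database labels are placeholders, since the abstract gives only the counts.

```python
from itertools import product

# Placeholder labels: the abstract states the counts, not the names.
outcomes = [f"outcome_{i:02d}" for i in range(1, 22)]   # 21 outcomes
databases = [f"database_{c}" for c in "ABCD"]           # 4 databases


def model_grid(outcomes, databases):
    """One (database, outcome) pair per model to develop."""
    return list(product(databases, outcomes))


grid = model_grid(outcomes, databases)
print(len(grid))  # 84
```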

Conclusions: The proof-of-concept study illustrates the framework's ability to develop reproducible models that can be readily shared. It offers the potential to perform extensive external validation of models and to improve their likelihood of clinical uptake. In future work, the framework will be applied to perform an "all-by-all" prediction analysis to assess the observational data prediction domain across numerous target populations, outcomes, and time-at-risk settings.

Figures

Figure 1. Illustration of how the homogeneous structure of the OMOP common data model enables sharing of model development code.
Figure 2. Illustration of the prediction problem. Patients enter the target population when they experience the index event (blue rectangle). For each patient, prediction variables are constructed using data recorded prior to the index date, and the presence of the outcome of interest is assessed during the time-at-risk.
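
The setup in Figure 2 can be sketched for a single patient as follows. This is an illustrative assumption, not the authors' implementation: the function name `build_example`, the 365-day lookback, the time-at-risk window of 1 to 365 days after the index date, and the event codes are all hypothetical.

```python
from datetime import date, timedelta


def build_example(index_date, events, outcome_code,
                  risk_start=1, risk_end=365, lookback_days=365):
    """Turn one patient's (date, code) events into (covariates, outcome label).

    All window lengths are assumed defaults for illustration only.
    """
    lookback_start = index_date - timedelta(days=lookback_days)
    window_start = index_date + timedelta(days=risk_start)
    window_end = index_date + timedelta(days=risk_end)

    # Prediction variables: codes recorded strictly before the index date.
    covariates = sorted({code for when, code in events
                         if lookback_start <= when < index_date})

    # Outcome label: 1 if the outcome code occurs during the time-at-risk.
    outcome = any(code == outcome_code and window_start <= when <= window_end
                  for when, code in events)
    return covariates, int(outcome)


index = date(2015, 6, 1)
events = [
    (date(2015, 3, 10), "hypertension"),  # before index: becomes a covariate
    (date(2015, 6, 1), "depression"),     # the index event itself
    (date(2015, 9, 15), "insomnia"),      # inside the time-at-risk
    (date(2017, 1, 1), "insomnia"),       # after the time-at-risk ends
]
print(build_example(index, events, "insomnia"))
```

Only data from before the index date feeds the covariates, so nothing observed during the time-at-risk can leak into the predictors.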

