Appl Clin Inform. 2022 Jan;13(1):56-66.
doi: 10.1055/s-0041-1740923. Epub 2022 Feb 16.

A Graphical Toolkit for Longitudinal Dataset Maintenance and Predictive Model Training in Health Care

Eric Bai et al. Appl Clin Inform. 2022 Jan.

Abstract

Background: Predictive analytic models, including machine learning (ML) models, are increasingly integrated into electronic health record (EHR)-based decision support tools for clinicians. These models have the potential to improve care, but are challenging to internally validate, implement, and maintain over the long term. Principles of ML operations (MLOps) may inform development of infrastructure to support the entire ML lifecycle, from feature selection to long-term model deployment and retraining.

Objectives: This study aimed to present the conceptual prototypes for a novel predictive model management system and to evaluate the acceptability of the system among three groups of end users.

Methods: Based on principles of user-centered software design, human-computer interaction, and ethical design, we created graphical prototypes of a web-based MLOps interface to support the construction, deployment, and maintenance of models using EHR data. To assess the acceptability of the interface, we conducted semistructured user interviews with three groups of users (health informaticians, clinical and data stakeholders, chief information officers) and evaluated preliminary usability using the System Usability Scale (SUS). We subsequently revised prototypes based on user input and developed user case studies.

Results: Our prototypes include design frameworks for feature selection, model training, deployment, long-term maintenance, visualization over time, and cross-functional collaboration. Users were able to complete 71% of prompted tasks without assistance. The average SUS score of the initial prototype was 75.8 out of 100, translating to a percentile range of 70 to 79, a letter grade of B, and an adjective rating of "good." We reviewed persona-based case studies that illustrate functionalities of this novel prototype.

Conclusion: The initial graphical prototypes of this MLOps system are preliminarily usable and demonstrate an unmet need within the clinical informatics landscape.
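The SUS figure reported in the Results follows the standard scoring rule for the 10-item questionnaire: each odd-numbered item contributes its response minus 1, each even-numbered item contributes 5 minus its response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The sketch below illustrates that arithmetic only. The function name `sus_score` and the response values are hypothetical and are not the study's data.

```python
from statistics import mean


def sus_score(responses):
    """Score one 10-item SUS questionnaire (responses 1-5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: contribution is response - 1.
            total += response - 1
        else:
            # Even items are negatively worded: contribution is 5 - response.
            total += 5 - response
    return total * 2.5


# Hypothetical responses from three participants (NOT data from the study).
participants = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]

scores = [sus_score(p) for p in participants]
average_score = mean(scores)
print(scores, average_score)
```

A study-level SUS value such as the one reported above is the mean of the per-participant scores computed this way.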

Conflict of interest statement

M.L.R. reports all support from NIGMS (grant no.: U54GM115677) for early phases of this work. S.L.S. reports all support from Brown University Summer Assistantship Fund.

Figures

Fig. 1
A visual depiction of our standardized machine learning workflow. The intersecting circular shapes highlight the iterative nature of predictive model creation. The outlined badges indicate the major end-user groups that interact at each step. While the health informatician participates in all stages of the workflow, clinical and data stakeholders and the CIO participate in selecting suitable datasets for ingestion, constructing variables, and evaluating model performance over time. API, application programming interface; CIO, chief information officer; RESTful, representational state transfer.
Fig. 2
Design overview schematic. An overview of version 2 of our proposed designs after incorporating user feedback. Each tile represents a major section of the design. The directional arrows represent our hypothesized workflow, from setting up the data analysis pipeline, to tracking model performance on the dashboard, to launching high-performing models via the API. The labels correspond to the design document that provides an in-depth overview of that functionality. API, application programming interface.
Fig. 3
Design walkthrough, specifying data schema and definitions. Walkthrough of specifying the data schema during data ingestion and input and outcome definitions; ( A ) specifying rules for mapping CSV file names to SQL table names; ( B ) specifying rules for mapping CSV column names to SQL column names; ( C ) overview of all inputs and outcomes, with prior versions tracked and accessible via menu; ( D ) SQL editor for specifying definitions with requirements list and query output preview. CSV, comma-separated values.
Fig. 4
Design walkthrough, configuring models. Walkthrough of configuring models; ( A ) overview of all active models showing prior runs; ( B ) create a new model by specifying inputs, outcomes, and language environment; ( C ) dedicated code editor for each step of defining a model, with the ability to add project files and specify bash scripts to run before or after function execution.
Fig. 5
Design walkthrough, tracking performance and deploying models. Walkthrough of tracking performance and deploying models; ( A ) compare aggregated performance across different outcomes, keep track of ongoing tasks; ( B ) compare model performance for a single outcome; ( C ) configure RESTful API endpoints to allow authorized applications to access trained models; ( D ) send automatically generated documentation to developers of authorized applications. API, application programming interface; RESTful, representational state transfer.
Fig. 6
Healthcare-specific design concerns. Feature designs for healthcare-specific data security and privacy concerns; ( A ) track all events across all users, configure notifications for key events; ( B ) manage access permissions, set password policies, and check server security status; ( C ) flag possible protected health information leaks for remediation; ( D ) create API access keys and manage access requests. API, application programming interface.
