JCO Clin Cancer Inform. 2021 Jun:5:719-727.
doi: 10.1200/CCI.20.00180.

Automated Electronic Health Record-Based Tool for Identification of Patients With Metastatic Disease to Facilitate Clinical Trial Patient Ascertainment

Jeffrey Kirshner et al. JCO Clin Cancer Inform. 2021 Jun.

Abstract

Purpose: To facilitate identification of clinical trial participation candidates, we developed a machine learning tool that automates the determination of a patient's metastatic status, on the basis of unstructured electronic health record (EHR) data.

Methods: This tool scans EHR documents, extracting text snippet features surrounding key words (such as metastatic, progression, and local). A regularized logistic regression model was trained and used to classify patients across five metastatic categories: highly likely and likely positive, highly likely and likely negative, and unknown. Using a real-world oncology database of patients with solid tumors with manually abstracted information as reference, we calculated sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). We validated the performance in a real-world data set, evaluating accuracy gains upon additional user review of the tool's outputs after integration into clinic workflows.
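
The snippet-extraction and five-category steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the window size, the function names `extract_snippets` and `categorize`, and the probability cut points are all assumptions made for illustration, and the abstract does not state how the model's score is mapped to categories. The regularized logistic regression itself is omitted; `categorize` only shows how a predicted probability from such a model might be bucketed.

```python
import re

# Key words named in the abstract; the tool's full key word list is not given there.
KEYWORDS = ("metastatic", "progression", "local")


def extract_snippets(text, keywords=KEYWORDS, window=5):
    """Return (keyword, snippet) pairs, where each snippet holds up to
    `window` tokens on either side of an exact key word match."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    snippets = []
    for i, token in enumerate(tokens):
        if token in keywords:
            start = max(0, i - window)
            snippets.append((token, " ".join(tokens[start:i + window + 1])))
    return snippets


def categorize(probability, cuts=(0.1, 0.4, 0.6, 0.9)):
    """Bucket a predicted probability of metastatic disease into the five
    categories named in the abstract. The cut points are hypothetical."""
    very_low, low, high, very_high = cuts
    if probability >= very_high:
        return "highly likely positive"
    if probability >= high:
        return "likely positive"
    if probability <= very_low:
        return "highly likely negative"
    if probability <= low:
        return "likely negative"
    return "unknown"
```

In a full pipeline the snippets would be turned into features (for example, bag-of-words counts) for the classifier; that step is left out here because the abstract does not describe it.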

Results: In the training data set (N = 66,532), the model sensitivity and specificity (% [95% CI]) were 82.4 [81.9 to 83.0] and 95.5 [95.3 to 96.7], respectively; the PPV was 89.3 [88.8 to 90.0], and the NPV was 94.0 [93.8 to 94.2]. In the validation sample (n = 200 from five distinct care sites), after user review of model outputs, values increased to 97.1 [85.1 to 99.9] for sensitivity, 98.2 [94.8 to 99.6] for specificity, 91.9 [78.1 to 98.3] for PPV, and 99.4 [96.6 to 100.0] for NPV. The model assigned 163 of 200 patients to the highly likely categories. The error prevalence was 4% before and 2% after user review.
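
For readers who want to reproduce this kind of summary, the four reported metrics follow from a 2x2 confusion matrix against the manually abstracted reference. The sketch below is a hedged illustration: the function name `diagnostic_metrics` and the counts in the usage lines are invented for demonstration and are not the study's data. The only figures taken from the abstract are the 163 of 200 patients assigned to the highly likely categories.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix
    counts (true/false positives and negatives against the reference)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true metastatic cases detected
        "specificity": tn / (tn + fp),  # share of non-metastatic cases cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Illustrative counts only; these are NOT the study's confusion matrix.
example = diagnostic_metrics(tp=40, fp=5, fn=10, tn=145)

# Share of validation patients in the highly likely categories (from the abstract).
highly_likely_share = 163 / 200
```

The last line gives 81.5%, which is consistent with the Conclusion's statement that high-confidence output covers more than 75% of cases.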

Conclusion: This tool infers metastatic status from unstructured EHR data with high accuracy and high confidence in more than 75% of cases, without requiring additional manual review. By enabling efficient characterization of metastatic status, this tool could mitigate a key barrier for patient ascertainment and clinical trial participation in community clinics.

Conflict of interest statement

Kelly Cohn
Consulting or Advisory Role: Flatiron Health

Madeline Richey
Employment: Flatiron Health
Stock and Other Ownership Interests: Roche

Peter Larson
Employment: Flatiron Health
Stock and Other Ownership Interests: Roche

Lauren Sutton
Employment: Flatiron Health

Evelyn Siu
Employment: Flatiron Health
Stock and Other Ownership Interests: Roche Pharma AG

Janet Donegan
Employment: Flatiron Health
Stock and Other Ownership Interests: Reports equity ownership in Flatiron Health Inc: reports stock ownership in Roche

Zexi Chen
Employment: Flatiron Health
Stock and Other Ownership Interests: Roche

Caroline Nightingale
Employment: Flatiron Health
Stock and Other Ownership Interests: Roche

Melissa Estévez
Employment: Flatiron Health
Stock and Other Ownership Interests: Flatiron Health, Roche

H. James Hamrick
Employment: Flatiron Health
Leadership: Flatiron Health
Stock and Other Ownership Interests: Roche

No other potential conflicts of interest were reported.
