JCO Clin Cancer Inform. 2021 Jun:5:622-630.
doi: 10.1200/CCI.20.00184.

Clinical Inflection Point Detection on the Basis of EHR Data to Identify Clinical Trial-Ready Patients With Cancer

Kenneth L Kehl et al. JCO Clin Cancer Inform. 2021 Jun.

Abstract

Purpose: To inform precision oncology, methods are needed to use electronic health records (EHRs) to identify patients with cancer who are experiencing clinical inflection points, consistent with worsening prognosis or a high propensity to change treatment, at specific time points. Such patients might benefit from real-time screening for clinical trials.

Methods: Using serial unstructured imaging reports for patients with solid tumors or lymphoma participating in a single-institution precision medicine study, we trained a deep neural network natural language processing (NLP) model to dynamically predict patients' prognoses and propensity to start new palliative-intent systemic therapy within 30 days. Model performance was evaluated using Harrell's c-index (for prognosis) and the area under the receiver operating characteristic curve (AUC; for new treatment and new clinical trial enrollment). Associations between model outputs and manual annotations of cancer progression were also evaluated using the AUC.
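The two evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names `harrell_c_index` and `roc_auc` and the toy values are hypothetical, and the c-index version is simplified (pairs with tied follow-up times are skipped). It only shows what each metric measures: the c-index is the share of comparable patient pairs in which the higher predicted risk belongs to the patient who died sooner, and the AUC is the share of positive/negative pairs in which the positive case received the higher score.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's c-index for right-censored survival data.

    times: follow-up time per observation
    events: 1 if death was observed, 0 if censored
    risk_scores: higher value means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is comparable only if
        # the shorter follow-up ended in an observed death.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    return concordant / comparable


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, invented for illustration only (not study data).
toy_times = [2, 5, 7, 10]           # months of follow-up
toy_events = [1, 1, 0, 1]           # third patient is censored
toy_risk = [2.1, 1.0, 1.3, 0.2]     # predicted log hazard
toy_treated = [1, 0, 1, 0]          # new treatment within 30 days
toy_scores = [0.9, 0.4, 0.35, 0.1]  # predicted score for new treatment

print(harrell_c_index(toy_times, toy_events, toy_risk))
print(roc_auc(toy_treated, toy_scores))
```

Both functions compare every pair directly, which is easy to read but quadratic in the number of observations; for a test set of tens of thousands of reports, an established survival-analysis or machine-learning library would be the practical choice.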

Results: A deep NLP model was trained and evaluated using 302,688 imaging reports for 16,780 patients. In a held-out test set of 34,770 reports for 1,952 additional patients, the model predicted survival with a c-index of 0.76 and initiation of new treatment with an AUC of 0.77. Among reports with manual EHR review (n = 1,488 reports for 110 patients with lung or colorectal cancer), model-generated prognostic scores were associated with annotation of cancer progression with an AUC of 0.78, and predictions of new treatment were associated with annotation of cancer progression with an AUC of 0.84.

Conclusion: Training a deep NLP model to identify clinical inflection points among patients with cancer is feasible. This approach could identify patients who may benefit from real-time targeted clinical trial screening interventions at health system scale.

Conflict of interest statement

Kenneth L. Kehl
Honoraria: Roche, IBM
Consulting or Advisory Role: Aetion

Eva M. Lepisto
Employment: Kiniksa
Stock and Other Ownership Interests: Kiniksa

Alexander Gusev
Patents, Royalties, Other Intellectual Property: Pending patent on an immunotherapy biomarker

Eliezer M. Van Allen
Stock and Other Ownership Interests: Syapse, Tango Therapeutics, Genome Medical, Microsoft, Ervaxx, Monte Rosa Therapeutics, Manifold Bio
Consulting or Advisory Role: Syapse, Roche, Third Rock Ventures, Takeda, Novartis, Genome Medical, Invitae, Illumina, Tango Therapeutics, Ervaxx, Janssen, Monte Rosa Therapeutics, Manifold Bio
Speakers' Bureau: Illumina
Research Funding: Bristol Myers Squibb, Novartis
Patents, Royalties, Other Intellectual Property: Patent on discovery of retained intron as source of cancer neoantigens, Patent on discovery of chromatin regulators as biomarkers of response to cancer immunotherapy, and Patent on clinical interpretation algorithms using cancer molecular data
Travel, Accommodations, Expenses: Roche/Genentech

Michael J. Hassett
Research Funding: IBM

Deborah Schrag
Stock and Other Ownership Interests: Merck
Honoraria: Pfizer
Consulting or Advisory Role: JAMA—Journal of the American Medical Association
Research Funding: AACR, GRAIL
Patents, Royalties, Other Intellectual Property: PRISSMM model is trademarked, and curation tools are available to academic medical centers and government under Creative Commons license
Travel, Accommodations, Expenses: IMEDEX, Precision Medicine World Conference
Other Relationship: JAMA—Journal of the American Medical Association

No other potential conflicts of interest were reported.

Figures

FIG 1.
Distribution of predicted log hazard of mortality by actual subsequent 6-month mortality (n = 34,770 imaging reports for 1,952 patients). Histogram of model output 1 (predicted log hazard of mortality per year), by actual 6-month survival, following each imaging report, demonstrating that patients with worse predicted prognoses at any given time point have a lower rate of 6-month survival following that time point. Test set, restricted to imaging reports at least 6 months before the cohort censoring date of December 31, 2018.
FIG 2.
Distribution of predicted log odds of new treatment in 30 days versus actual subsequent treatment in 30 days (n = 34,770 imaging reports for 1,952 patients). Histogram of model output 2 (predicted log odds of new treatment within 30 days), by actual rate of new treatment within 30 days, following each imaging report, demonstrating that patients with higher predicted odds of initiating new treatment after any given time point had a higher actual rate of initiating new treatment. Test set, restricted to imaging reports at least 6 months before the cohort censoring date of December 31, 2018.
FIG 3.
Distribution of predicted log hazards of mortality versus manually annotated cancer progression among patients with non–small-cell lung cancer or colorectal cancer (n = 1,488 imaging reports for 110 patients). Histogram of model output 1 (predicted log hazards of mortality per year), by manual annotation of the presence or absence of cancer progression on each imaging report, demonstrating that patients with higher predicted log hazards of mortality at any given time point were more likely to have cancer progression manually annotated at that time point. Test set, restricted to imaging reports at least 6 months before the cohort censoring date of December 31, 2018.
FIG 4.
Distribution of predicted log odds of new treatment within 30 days versus manually annotated cancer progression among patients with non–small-cell lung cancer or colorectal cancer (n = 1,488 imaging reports for 110 patients). Histogram of model output 2 (predicted log odds of new palliative-intent systemic therapy within 30 days), by manual annotation of the presence or absence of cancer progression on each imaging report, demonstrating that patients with a higher predicted log odds of new treatment at any given time point were more likely to have cancer progression manually annotated at that time point. Test set, restricted to imaging reports at least 6 months before the cohort censoring date of December 31, 2018.

Publication types