2025 May 14;4(5):e0000755.
doi: 10.1371/journal.pdig.0000755. eCollection 2025 May.

From manual clinical criteria to machine learning algorithms: Comparing outcome endpoints derived from diverse electronic health record data modalities

Shreya Chappidi et al. PLOS Digit Health.

Abstract

Background: Progression free survival (PFS) is a critical clinical outcome endpoint during cancer management and treatment evaluation. Yet PFS is often missing from publicly available datasets because generating PFS metrics is currently subjective, expert-dependent, and time-intensive. Given emerging research in multi-modal machine learning (ML), we explored the benefits and challenges associated with mining different electronic health record (EHR) data modalities and automating extraction of PFS metrics via ML algorithms.

Methods: We analyzed EHR data from 92 patients with pathology-proven glioblastoma (GBM), obtaining 233 corticosteroid prescriptions, 2080 radiology reports, and 743 brain MRI scans. Three methods were developed to derive clinical PFS: 1) frequency analysis of corticosteroid prescriptions, 2) natural language processing (NLP) of radiology reports, and 3) computer vision (CV) volumetric analysis of imaging. Outputs from these methods were compared to manually annotated clinical guideline PFS metrics.

Results: Employing data-driven methods, standalone progression rates were 63% (prescription), 78% (NLP), and 54% (CV), compared to the 99% progression rate from manually applied clinical guidelines using integrated data sources. The prescription method identified progression an average of 5.2 months later than the clinical standard, while the CV and NLP algorithms identified progression earlier by 2.6 and 6.9 months, respectively. While lesion growth is a clinical guideline progression indicator, only half of patients exhibited increasing contrast-enhancing tumor volumes during scan-based CV analysis.

Conclusion: Our results indicate that data-driven algorithms can extract tumor progression outcomes from existing EHR data. However, ML methods are subject to varying availability bias, supporting contextual information, and pre-processing resource burdens that influence the extracted PFS endpoint distributions. Our scan-based CV results also suggest that the automation of clinical criteria may not align with human intuition. Our findings indicate a need for improved data source integration, validation, and revisiting of clinical criteria in parallel to multi-modal ML algorithm development.
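The two summary statistics reported in the Results (a method's standalone progression rate, and its average offset in months from the clinical standard) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout (patient ID mapped to a progression date, or `None` when the method flagged no progression), the 30.44-day month, and the sign convention are all assumptions made here for illustration, and the dates in the example are invented rather than taken from the study.

```python
from datetime import date
from statistics import mean

# Assumed conversion factor: average length of a Gregorian month in days.
DAYS_PER_MONTH = 30.44


def progression_rate(derived, cohort_size):
    """Fraction of the cohort for which a method flagged any progression date."""
    flagged = sum(1 for d in derived.values() if d is not None)
    return flagged / cohort_size


def mean_offset_months(derived, standard):
    """Mean of (derived date - standard date) in months.

    Only patients with a date from both sources are included.
    Positive values mean the method is later than the standard,
    negative values mean it is earlier.
    """
    diffs = [
        (derived[pid] - standard[pid]).days / DAYS_PER_MONTH
        for pid in derived
        if derived[pid] is not None and standard.get(pid) is not None
    ]
    return mean(diffs) if diffs else None


# Invented example data (hypothetical patients, not study results).
standard = {"p1": date(2020, 1, 1), "p2": date(2020, 6, 1), "p3": date(2021, 1, 1)}
derived = {"p1": date(2020, 3, 1), "p2": None, "p3": date(2020, 12, 1)}

print(progression_rate(derived, cohort_size=3))
print(mean_offset_months(derived, standard))
```

Restricting the offset to patients with both dates is one possible choice; it means a method with a low progression rate is compared on a smaller, possibly unrepresentative subset, which is one way the availability bias mentioned in the Conclusion could arise.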

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Sample cancer patient treatment timeline with data generated and captured within the EHR.

Fig 2. Overall patient cohort with overlapping data source availability.

Fig 3. Paradigm for manual and automated methods to derive progression free survival.

Fig 4. a) Boxplot and b) scatterplot distributions of manual and data-derived progression free survival dates. The dark green line represents the clinical standard PFS dates, with points falling above the dark green line indicating that the automated method derived an earlier PFS date compared to the clinical standard and points falling below indicating that the method derived a later PFS date. The light blue, dark blue, and light green trendlines reflect the Ordinary Least Squares linear regression for the radiology report, MRI scan, and prescription methods, respectively.

Fig 5. Patient timelines and progression results for available a) steroid prescriptions, b) radiology reports, and c) brain MRI scans. c) Red and blue points indicate scans with a relative increase and decrease, respectively, in contrast-enhancing tumor volumes compared to the baseline post-surgery, pre-RT scan.

Fig 6. Progression-indicating datapoints for studied patient cohort.
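
The increase/decrease labeling described for Fig 5c (each scan's contrast-enhancing tumor volume compared with the baseline post-surgery, pre-RT scan) can be sketched as follows. This is an illustrative assumption about how such a comparison might be computed, not the authors' pipeline: the function names, the millilitre units, and the rule that any nonzero change counts as an increase or decrease are hypothetical, since the abstract does not state a threshold. The volumes in the example are invented.

```python
def relative_volume_change(baseline_ml, followup_ml):
    """Relative change of a follow-up volume against the baseline volume."""
    if baseline_ml <= 0:
        raise ValueError("baseline volume must be positive")
    return (followup_ml - baseline_ml) / baseline_ml


def label_scans(baseline_ml, followups):
    """Label each follow-up scan relative to the baseline scan.

    followups: list of (scan_id, volume_ml) tuples.
    Returns a list of (scan_id, relative_change, label) tuples, where label
    is "increase", "decrease", or "no change".
    """
    labeled = []
    for scan_id, volume_ml in followups:
        change = relative_volume_change(baseline_ml, volume_ml)
        if change > 0:
            label = "increase"
        elif change < 0:
            label = "decrease"
        else:
            label = "no change"
        labeled.append((scan_id, change, label))
    return labeled


# Invented volumes for one hypothetical patient (not study data).
for row in label_scans(10.0, [("scan_1", 12.5), ("scan_2", 7.5)]):
    print(row)
```

A real analysis would likely need a tolerance band around zero to absorb segmentation noise; that threshold is left out here because the source does not give one.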
