This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2024 May 13:2024.01.18.24301502.

doi: 10.1101/2024.01.18.24301502.

SEETrials: Leveraging Large Language Models for Safety and Efficacy Extraction in Oncology Clinical Trials

Kyeryoung Lee, Hunki Paek, Liang-Chin Huang, C Beau Hilton, Surabhi Datta, Josh Higashi, Nneka Ofoegbu, Jingqi Wang, Samuel M Rubinstein, Andrew J Cowan, Mary Kwok, Jeremy L Warner, Hua Xu, Xiaoyan Wang

PMID: 38798420
PMCID: PMC11118548
DOI: 10.1101/2024.01.18.24301502

Kyeryoung Lee et al. medRxiv. 2024.

Update in

SEETrials: Leveraging large language models for safety and efficacy extraction in oncology clinical trials.
Lee K, Paek H, Huang LC, Hilton CB, Datta S, Higashi J, Ofoegbu N, Wang J, Rubinstein SM, Cowan AJ, Kwok M, Warner JL, Xu H, Wang X. Inform Med Unlocked. 2024;50:101589. doi: 10.1016/j.imu.2024.101589. Epub 2024 Oct 11. PMID: 39493413. Free PMC article.

Abstract

Background: Initial insights into oncology clinical trial outcomes are often gleaned manually from conference abstracts. We aimed to develop an automated system to extract safety and efficacy information from study abstracts with high precision and fine granularity, transforming them into computable data for timely clinical decision-making.

Methods: We collected clinical trial abstracts from key conferences and PubMed (2012-2023). The SEETrials system was developed with four modules: preprocessing, prompt modeling, knowledge ingestion, and postprocessing. We evaluated the system's performance qualitatively and quantitatively and assessed its generalizability across different cancer types: multiple myeloma (MM), breast, lung, lymphoma, and leukemia. Furthermore, the efficacy and safety of innovative therapies in MM, including CAR-T, bispecific antibodies, and antibody-drug conjugates (ADC), were analyzed across a large set of clinical trial studies.

Results: SEETrials achieved high precision (0.958), recall (sensitivity) (0.944), and F1 score (0.951) across 70 data elements present in the MM trial studies. Generalizability tests on four additional cancers yielded precision, recall, and F1 scores within the 0.966-0.986 range. The distribution of safety and efficacy-related entities varied across therapies, with certain adverse events more common in specific treatments. Comparative performance analysis using overall response rate (ORR) and complete response (CR) highlighted differences among therapies: CAR-T (ORR: 88%, 95% CI: 84-92%; CR: 59%, 95% CI: 53-66%), bispecific antibodies (ORR: 64%, 95% CI: 55-73%; CR: 27%, 95% CI: 16-37%), and ADC (ORR: 51%, 95% CI: 37-65%; CR: 26%, 95% CI: 1-51%). Notable study heterogeneity (I² heterogeneity index >75%) was identified across several outcome entities analyzed within therapy subgroups.

Conclusion: SEETrials demonstrated highly accurate data extraction and versatility across different therapeutics and various cancer domains. Its automated processing of large datasets facilitates nuanced data comparisons, promoting the swift and effective dissemination of clinical insights.
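
The evaluation metrics named in the Results can be illustrated with a short sketch. The abstract does not say how the authors computed them, so the code below assumes the standard definitions: F1 as the harmonic mean of precision and recall, and I² derived from Cochran's Q with inverse-variance weights. The function names and the per-study numbers in the I² example are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def i_squared(effects: Sequence[float], variances: Sequence[float]) -> float:
    """I^2 in percent, from Cochran's Q with inverse-variance weights.

    Assumed formula: I^2 = max(0, (Q - df) / Q) * 100, with df = k - 1.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Precision and recall reported for the MM abstracts.
print(round(f1_score(0.958, 0.944), 3))  # 0.951

# Hypothetical response rates and variances for two studies, illustration only.
print(i_squared([0.5, 0.9], [0.01, 0.01]))
```

Under these assumptions the reported precision and recall reproduce the reported F1 of 0.951. An I² above 75% means that most of the observed variation between studies exceeds what sampling error alone would explain.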
