Sci Rep. 2024 Nov 5;14(1):26783.
doi: 10.1038/s41598-024-77535-y.

Development and validation of a novel AI framework using NLP with LLM integration for relevant clinical data extraction through automated chart review

Mert Marcel Dagli et al. Sci Rep. 2024.

Abstract

Accurate extraction of surgical data from electronic health records (EHRs), particularly from operative notes, is crucial but complex and time-intensive when done through manual chart review (MCR), which is also limited by human error due to fatigue and the level of training. This study aimed to develop and validate a novel Natural Language Processing (NLP) algorithm integrated with a Large Language Model (LLM; GPT4-Turbo) to automate the extraction of spinal surgery data from EHRs. The algorithm employed a two-stage approach. In the first stage, a rule-based NLP framework reviewed the text and classified candidate segments, preserving their reference segments. In the second stage, the LLM verified these segments. The primary outcomes were the accurate extraction of surgical data, including the type of surgery, levels operated, number of disks removed, and presence of intraoperative incidental durotomies. Secondary objectives explored time efficiency, tokenization lengths, and costs. Performance was assessed across two validation databases using accuracy, sensitivity, discrimination, F1-score, and precision, with 95% confidence intervals calculated by percentile-based bootstrapping. The NLP + LLM algorithm performed markedly better across all performance metrics and demonstrated significant improvements in time and cost efficiency. These results suggest the potential for widespread adoption of this technology.

Keywords: Artificial intelligence; Data science; Electronic health records; Humans; Natural language processing; Quality improvement.
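
The two-stage approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword pattern, the `Candidate` type, and the `stub_verifier` function are hypothetical, and the stub stands in for the LLM call, which is not specified here. The sketch only shows the shape of the idea, in which a rule-based pass keeps candidate segments together with their source text and a second pass confirms or rejects each one.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical keyword rule; the study's actual rules are not given in the abstract.
DUROTOMY_PATTERN = re.compile(r"\b(durotomy|dural tear|csf leak)\b", re.IGNORECASE)


@dataclass
class Candidate:
    label: str    # what the rule thinks the segment is about
    segment: str  # the reference text, preserved for the second stage


def find_candidates(note: str) -> List[Candidate]:
    """Stage 1: split the note into sentences and keep those matching the rule."""
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())
    return [Candidate("durotomy", s) for s in sentences if DUROTOMY_PATTERN.search(s)]


def verify(candidates: List[Candidate], verifier: Callable[[Candidate], bool]) -> bool:
    """Stage 2: the record is positive if the verifier confirms any candidate."""
    return any(verifier(c) for c in candidates)


def stub_verifier(candidate: Candidate) -> bool:
    """Placeholder for the LLM check: rejects segments containing a simple negation."""
    return re.search(r"\b(no|without)\b", candidate.segment, re.IGNORECASE) is None


note = (
    "L4-L5 diskectomy was performed. "
    "A small dural tear was noted and repaired. "
    "Closure was routine."
)
candidates = find_candidates(note)
has_durotomy = verify(candidates, stub_verifier)
print(candidates, has_durotomy)
```

Because only the matched segments reach the second stage, the verifier sees far less text than the full operative note, which is the property the tokenization comparison in Fig. 2 examines.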

Conflict of interest statement

An intellectual property disclosure has been made in accordance with the University of Pennsylvania’s institutional policies and contracts, without any relation to the present work at the time of submission. The Penn Center for Innovation has opted to proceed with copyright protection and is currently evaluating the patentability and potential financial interests in line with their standard procedures.

Figures

Fig. 1
Receiver operating characteristic curves with area under the curve (AUC-ROC) for the worst replicate of the natural language processing algorithm with large language model integration (NLP + LLM; A), the natural language processing algorithm alone (NLP-only; B), and a research year medical student (Medical Student; C). 95% confidence intervals (95% CI) were calculated by percentile-based bootstrapping with 10,000 resamples.
Fig. 2
Comparison of mean tokenization lengths of reference text in input prompts per record using natural language processing with large language model integration (NLP + LLM) versus full-length operative notes in input prompts not restricted to candidates (Full Entry) for diskectomy classifications (A) and intraoperative incidental durotomy classifications (B) during validation.
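
The percentile-based bootstrap mentioned in the Fig. 1 caption can be sketched as follows. This is an illustrative example under stated assumptions, not the study's code: the function name `percentile_bootstrap_ci`, the use of accuracy as the statistic, and the example data are all hypothetical. Only the 10,000 resamples and the 95% level come from the caption.

```python
import random
from typing import List, Tuple


def percentile_bootstrap_ci(
    correct: List[int],
    n_resamples: int = 10_000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    `correct` holds 1 for a correctly classified record and 0 otherwise.
    """
    rng = random.Random(seed)
    n = len(correct)
    stats = []
    for _ in range(n_resamples):
        # Resample records with replacement and recompute the statistic.
        sample = rng.choices(correct, k=n)
        stats.append(sum(sample) / n)
    stats.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles, clamped to valid indices.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return sum(correct) / n, stats[lo_idx], stats[hi_idx]


# Hypothetical data: 90 correct classifications out of 100 records.
outcomes = [1] * 90 + [0] * 10
accuracy, lower, upper = percentile_bootstrap_ci(outcomes)
print(f"accuracy={accuracy:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The same resampling loop applies to other metrics such as sensitivity or AUC by swapping the statistic computed on each resample.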
