Deep Learning Approach for Negation and Speculation Detection for Automated Important Finding Flagging and Extraction in Radiology Report: Internal Validation and Technique Comparison Study
- PMID: 37097731
- PMCID: PMC10170361
- DOI: 10.2196/46348
Abstract
Background: Negation and speculation unrelated to abnormal findings can lead to false-positive alarms for automatic radiology report highlighting or flagging by laboratory information systems.
Objective: This internal validation study evaluated the performance of natural language processing methods (NegEx, NegBio, NegBERT, and transformers).
Methods: We annotated all negative and speculative statements unrelated to abnormal findings in reports. In experiment 1, we fine-tuned several transformer models (ALBERT [A Lite Bidirectional Encoder Representations from Transformers], BERT [Bidirectional Encoder Representations from Transformers], DeBERTa [Decoding-Enhanced BERT With Disentangled Attention], DistilBERT [Distilled version of BERT], ELECTRA [Efficiently Learning an Encoder That Classifies Token Replacements Accurately], ERNIE [Enhanced Representation through Knowledge Integration], RoBERTa [Robustly Optimized BERT Pretraining Approach], SpanBERT, and XLNet) and compared their performance using precision, recall, accuracy, and F1-scores. In experiment 2, we compared the best model from experiment 1 with 3 established negation and speculation-detection algorithms (NegEx, NegBio, and NegBERT).
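The four metrics named in the Methods can be computed from a confusion matrix, as in the minimal sketch below. It assumes binary labels (1 = the item belongs to a negative or speculative statement, 0 = it does not). The abstract does not state the unit of evaluation, so the per-item labels, the `binary_metrics` helper, and the example data are hypothetical illustrations, not the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Return precision, recall, accuracy, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical gold and predicted labels for eight items (illustration only).
y_true = [0, 1, 1, 0, 0, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Accuracy counts true negatives and F1 does not, so the two can diverge when the positive class is a minority. That is one reason to report both.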
Results: Our study collected 6000 radiology reports from 3 branches of the Chi Mei Hospital, covering multiple imaging modalities and body parts. A total of 15.01% (105,755/704,512) of words and 39.45% (4529/11,480) of important diagnostic keywords occurred in negative or speculative statements unrelated to abnormal findings. In experiment 1, all models achieved an accuracy of >0.98 and F1-score of >0.90 on the test data set. ALBERT exhibited the best performance (accuracy=0.991; F1-score=0.958). In experiment 2, ALBERT outperformed the optimized NegEx, NegBio, and NegBERT methods in terms of overall performance (accuracy=0.996; F1-score=0.991), in the prediction of whether diagnostic keywords occur in speculative statements unrelated to abnormal findings, and in the improvement of the performance of keyword extraction (accuracy=0.996; F1-score=0.997).
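The two proportions reported in the Results can be reproduced from the stated counts. The short check below uses only the numbers given in the abstract; the `percent` helper is a hypothetical name introduced for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100 * part / whole, 2)


# Counts as reported in the abstract.
words_pct = percent(105_755, 704_512)    # words in negative or speculative statements
keywords_pct = percent(4_529, 11_480)    # important diagnostic keywords in such statements

print(f"words: {words_pct}%  keywords: {keywords_pct}%")
```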
Conclusions: The ALBERT deep learning method showed the best performance. Our results represent a significant advancement in the clinical applications of computer-aided notification systems.
Keywords: BERT; Bidirectional Encoder Representations from Transformers; clinical application; deep learning; natural language processing; negation; radiology; radiology report; supervised learning; transfer learning; validation study.
©Kung-Hsun Weng, Chung-Feng Liu, Chia-Jung Chen. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 25.04.2023.
Conflict of interest statement
Conflicts of Interest: None declared.