J Imaging Inform Med. 2025 Jun;38(3):1297-1303.
doi: 10.1007/s10278-024-01274-9. Epub 2024 Sep 25.

A Large Language Model to Detect Negated Expressions in Radiology Reports


Yvonne Su et al. J Imaging Inform Med. 2025 Jun.

Abstract

Natural language processing (NLP) is crucial for extracting information accurately from unstructured text to provide insights for clinical decision-making, quality improvement, and medical research. This study compared the performance of a rule-based NLP system and a medical-domain transformer-based model in detecting negated concepts in radiology reports. Using a corpus of 984 de-identified radiology reports from a large U.S.-based academic health system (1000 consecutive reports, excluding 16 duplicates), the investigators compared the rule-based medspaCy system and the Clinical Assertion and Negation Classification Bidirectional Encoder Representations from Transformers (CAN-BERT) system in detecting negated expressions of terms from RadLex, the Unified Medical Language System Metathesaurus, and the Radiology Gamuts Ontology. Power analysis determined a sample size of 382 terms to achieve α = 0.05 with statistical power of 0.8 for McNemar's test; based on an estimated negation rate of 15%, 2800 randomly selected terms were annotated manually as negated or not negated. Precision, recall, and F1 of the two models were compared using McNemar's test. Of the 2800 terms, 387 (13.8%) were negated. For negation detection, medspaCy attained a recall of 0.795, precision of 0.356, and F1 of 0.492. CAN-BERT achieved a recall of 0.785, precision of 0.768, and F1 of 0.777. Although recall did not differ significantly, CAN-BERT had significantly better precision (χ2 = 304.64; p < 0.001). The transformer-based CAN-BERT model detected negated terms in radiology reports with high precision and recall; its precision significantly exceeded that of the rule-based medspaCy system. Use of this system will improve data extraction from textual reports to support information retrieval, AI model training, and discovery of causal relationships.
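Rule-based systems such as medspaCy descend from the NegEx algorithm, which marks a term as negated when it appears within a fixed window of words after a negation trigger phrase. The following is a minimal, self-contained sketch of that idea in plain Python; the trigger list, scope size, and function name are illustrative assumptions, not medspaCy's actual implementation.

```python
import re

# Illustrative trigger list and scope size -- real systems (NegEx, medspaCy's
# ConText component) use far larger lexicons plus termination and
# pseudo-trigger rules.
NEGATION_TRIGGERS = ["no", "without", "denies", "negative for", "absence of"]
SCOPE = 6  # words after a trigger that fall inside its negation scope

def is_negated(sentence: str, term: str) -> bool:
    """True if `term` appears within SCOPE words after a negation trigger."""
    text = sentence.lower()
    pos = text.find(term.lower())
    if pos == -1:
        return False
    for trigger in NEGATION_TRIGGERS:
        for m in re.finditer(r"\b" + re.escape(trigger) + r"\b", text):
            # Trigger must precede the term, with few enough words between.
            if m.end() <= pos and len(text[m.end():pos].split()) < SCOPE:
                return True
    return False
```

Even this toy version suggests why rule-based precision can suffer: a trigger fires on every in-window term, including terms outside its true linguistic scope (e.g., past a clause boundary), producing false "negated" labels.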

Keywords: Large language models; Named entity recognition; Natural language processing; Negated expression (negex) detection; Radiology reports.
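The per-term comparison reported in the abstract rests on McNemar's test, which considers only discordant pairs: terms one system classifies correctly and the other incorrectly. A minimal sketch of the statistic, using illustrative counts rather than the study's data:

```python
CHI2_CRIT_1DF_05 = 3.841  # chi-squared critical value, 1 df, alpha = 0.05

def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar's statistic (no continuity correction) from discordant counts.

    b: terms system A classified correctly and system B incorrectly
    c: terms system B classified correctly and system A incorrectly
    Concordant terms (both right or both wrong) do not enter the statistic.
    """
    if b + c == 0:
        return 0.0
    return (b - c) ** 2 / (b + c)

# Illustrative counts, not the study's data:
chi2 = mcnemar_chi2(10, 40)            # (10 - 40)**2 / 50 = 18.0
significant = chi2 > CHI2_CRIT_1DF_05  # compare against chi2(1) at alpha=0.05
```

Because concordant pairs cancel out, the test isolates where the two systems actually disagree, which is why it suits paired comparisons on the same 2800-term sample.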


Conflict of interest statement

Declarations. Ethics Approval: Study protocol approved by the University of Pennsylvania IRB. Informed consent from patients was waived. Competing Interests: The authors declare no competing interests.

Figures

Fig. 1 Examples of negation detection. A MedspaCy correctly identifies the term of interest as negated; CAN-BERT does not. B Both medspaCy and CAN-BERT incorrectly identify the term of interest as negated. C CAN-BERT correctly identifies the term as negated; medspaCy does not. D Both medspaCy and CAN-BERT correctly identify the term as not negated.

Fig. 2 Receiver operating characteristic (ROC) curves for CAN-BERT ("BERT") and medspaCy. AUC = area under the ROC curve.
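The AUC summarizing the ROC curves in Fig. 2 equals the probability that a randomly chosen negated term receives a higher negation score than a randomly chosen non-negated term. A minimal sketch of that rank-based computation; the scores and labels below are illustrative, not the study's data:

```python
def auc(scores, labels):
    """AUC as the probability a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative negation scores (label 1 = term truly negated):
perfect = auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])  # every positive outranks
chance = auc([0.8, 0.9, 0.2, 0.4], [1, 0, 0, 1])   # half the pairs outrank
```

This pairwise formulation is equivalent to integrating the ROC curve and makes clear that AUC measures ranking quality independent of any single decision threshold.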


