JCO Glob Oncol. 2023 Sep;9:e2300191.

doi: 10.1200/GO.23.00191.

Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool

Affiliations

¹ D'Or Institute for Research and Education (IDOR), Rio de Janeiro, Brazil.
² Radiomics and Augmented Intelligence Laboratory (RAIL), University of Florida, Gainesville, FL.
³ Federal University of Health Sciences of Porto Alegre, Porto Alegre, Brazil.
⁴ Stanford Hospital, Stanford University Medical Center, Palo Alto, CA.

PMID: 37769221
PMCID: PMC10581645
DOI: 10.1200/GO.23.00191

Rodrigo Basilio et al. JCO Glob Oncol. 2023 Sep.

Erratum in

Erratum: Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool.
[No authors listed] JCO Glob Oncol. 2024 Jan;10:e2300456. doi: 10.1200/GO.23.00456. PMID: 38181321. Free PMC article. No abstract available.

Abstract

Purpose: To evaluate the diagnostic performance of a natural language processing (NLP) model in detecting incidental lung nodules (ILNs) in unstructured chest computed tomography (CT) reports.

Methods: All consecutive unstructured reports of chest CT scans performed at a tertiary hospital between 2020 and 2021 were retrospectively reviewed (n = 21,542) to train the NLP tool. Internal validation was performed in a different external cohort of 300 chest CT scans, using reference readings of both the CT scans and the reports by two radiologists. Second, external validation was performed in a cohort of randomly selected unstructured chest CT reports from scans conducted in May 2022 at 57 different hospitals. A review by the same thoracic radiologists was used as the gold standard. Sensitivity, specificity, and accuracy were calculated.

Results: Of 21,542 CT reports, 484 mentioned at least one ILN (mean age, 71 ± 17.6 [standard deviation] years; women, 52%) and were included in the training set. In the internal validation (n = 300), the NLP tool detected ILN with a sensitivity of 100.0% (95% CI, 97.6 to 100.0), a specificity of 95.9% (95% CI, 91.3 to 98.5), and an accuracy of 98.0% (95% CI, 95.7 to 99.3). In the external validation (n = 977), the NLP tool yielded a sensitivity of 98.4% (95% CI, 94.5 to 99.8), a specificity of 98.6% (95% CI, 97.5 to 99.3), and an accuracy of 98.6% (95% CI, 97.6 to 99.2). Twelve months after the initial reports, 8 (8.60%) patients had a final diagnosis of lung cancer, among which 2 (2.15%) would have been lost to follow-up without the NLP tool.
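
The sensitivity, specificity, and accuracy reported above all derive from a 2 × 2 confusion matrix, which the abstract does not give. The sketch below shows how such figures can be computed from raw counts. The counts are hypothetical, chosen only to sum to 300 and to be consistent with the internal-validation percentages; they are not taken from the paper. The abstract also does not state how the confidence intervals were obtained, so the Wilson score interval used here is an assumption and its bounds may differ from the published ones.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96).

    Assumption: the abstract does not name its CI method; Wilson is used here
    because it needs only the standard library.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (point_estimate, (ci_low, ci_high))} from confusion counts."""
    positives = tp + fn   # reports with an ILN according to the gold standard
    negatives = tn + fp   # reports without an ILN according to the gold standard
    total = positives + negatives
    return {
        "sensitivity": (tp / positives, wilson_interval(tp, positives)),
        "specificity": (tn / negatives, wilson_interval(tn, negatives)),
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


# Hypothetical counts (NOT reported in the abstract), summing to n = 300.
ILLUSTRATIVE_COUNTS = {"tp": 152, "fp": 6, "fn": 0, "tn": 142}

if __name__ == "__main__":
    for name, (value, (low, high)) in diagnostic_metrics(**ILLUSTRATIVE_COUNTS).items():
        print(f"{name}: {value:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

With perfect sensitivity the interval is one-sided in practice: the upper bound sits at 100%, and the lower bound depends only on how many positive reports were evaluated.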

Conclusion: NLP can be used to identify ILNs in unstructured reports with high accuracy, allowing a timely recall of patients and a potential diagnosis of early-stage lung cancer that might have been lost to follow-up.

Conflict of interest statement

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/go/authors/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).

Figures

**FIG 1**
Flow diagram. ^aFleischner Society guidelines published in 2017 by MacMahon et al. CT, computed tomography; ILN, incidental lung nodule; NLP, natural language processing.

**FIG 2**
An example of our information extraction algorithm. (A) Initial dependency tree. (B) Final information chunks.

**FIG 3**
A 64-year-old man presented to the emergency department with acute dyspnea. (A) Axial and (B) coronal enhanced chest CT images demonstrated a spiculated solid pulmonary nodule of 23 mm, located in the right upper lobe, and severe emphysema. The patient was discharged from the emergency department with a final diagnosis of COVID-19 infection and instructions to arrange an appointment with a thoracic surgeon. However, the patient did not proceed with further investigation, and the CT report was identified using our NLP system. The patient was recalled and, after investigation, the diagnosis of lung cancer was confirmed at an early stage, IA (T1cN0M0). CT, computed tomography; NLP, natural language processing.

**FIG 4**
Images of a 58-year-old man, a heavy smoker, who presented to the emergency department with suspected pulmonary embolism. (A) Axial and (B) sagittal enhanced chest CT images demonstrated a spiculated solid pulmonary nodule with a diameter of 6 mm, located in the superior segment of the right lower lobe, and severe emphysema. Although a control scan at 6-12 months was suggested in the initial CT report, the patient was lost to follow-up, and the case was retrieved using our NLP system. In the control CT scan performed 18 months later, the nodule had doubled in size, as demonstrated in the (C) axial and (D) sagittal enhanced CT images. The final diagnosis was lung cancer, which was staged as T1aN0M0, IA. CT, computed tomography; NLP, natural language processing.
