JCO Glob Oncol. 2023 Sep:9:e2300191.
doi: 10.1200/GO.23.00191.

Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool

Rodrigo Basilio et al. JCO Glob Oncol. 2023 Sep.

Erratum in

Abstract

Purpose: To evaluate the diagnostic performance of a natural language processing (NLP) model in detecting incidental lung nodules (ILNs) in unstructured chest computed tomography (CT) reports.

Methods: All consecutive unstructured reports of chest CT scans performed at a tertiary hospital between 2020 and 2021 were retrospectively reviewed (n = 21,542) to train the NLP tool. First, internal validation was performed in a different external cohort of 300 chest CT scans, using readings of both the CT scans and the reports by two radiologists as the reference. Second, external validation was performed in a cohort of randomly selected unstructured chest CT reports from 57 different hospitals, issued in May 2022. A review by the same thoracic radiologists was used as the gold standard. The sensitivity, specificity, and accuracy were calculated.

Results: Of 21,542 CT reports, 484 mentioned at least one ILN (mean age, 71 ± 17.6 [standard deviation] years; women, 52%) and were included in the training set. In the internal validation (n = 300), the NLP tool detected ILN with a sensitivity of 100.0% (95% CI, 97.6 to 100.0), a specificity of 95.9% (95% CI, 91.3 to 98.5), and an accuracy of 98.0% (95% CI, 95.7 to 99.3). In the external validation (n = 977), the NLP tool yielded a sensitivity of 98.4% (95% CI, 94.5 to 99.8), a specificity of 98.6% (95% CI, 97.5 to 99.3), and an accuracy of 98.6% (95% CI, 97.6 to 99.2). Twelve months after the initial reports, 8 (8.60%) patients had a final diagnosis of lung cancer, among which 2 (2.15%) would have been lost to follow-up without the NLP tool.
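
The internal validation figures above can be reproduced from a 2x2 confusion matrix. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: the counts (154 true positives, 0 false negatives, 6 false positives, 140 true negatives) are back-calculated from the reported percentages for n = 300, and the 95% CIs are taken to be exact (Clopper-Pearson) binomial intervals. The function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k/n.

    ASSUMPTION: the abstract does not name its CI method; the exact
    binomial interval is used here because it is a common choice.
    """
    def solve(f):
        # f is decreasing in p on [0, 1]; locate its root by bisection.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else solve(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: P(X <= k | p) = alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

def diagnostic_metrics(tp, fn, fp, tn):
    """Return (successes, trials) for each metric of a 2x2 table."""
    return {
        "sensitivity": (tp, tp + fn),            # detected ILNs / all ILNs
        "specificity": (tn, tn + fp),            # correct negatives / all negatives
        "accuracy": (tp + tn, tp + fn + fp + tn),
    }

# ASSUMED counts, back-calculated from the reported percentages (n = 300);
# they are not given in the abstract.
counts = dict(tp=154, fn=0, fp=6, tn=140)

for name, (k, n) in diagnostic_metrics(**counts).items():
    lo, hi = clopper_pearson(k, n)
    print(f"{name}: {100 * k / n:.1f}% (95% CI, {100 * lo:.1f} to {100 * hi:.1f})")
```

Under these assumed counts, the printed values should agree with the reported internal validation results to one decimal place; a different split of the 300 cases would give different intervals.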

Conclusion: NLP can be used to identify ILNs in unstructured reports with high accuracy, allowing a timely recall of patients and a potential diagnosis of early-stage lung cancer that might have been lost to follow-up.

Conflict of interest statement

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/go/authors/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).

Figures

FIG 1
Flow diagram. ᵃFleischner Society guidelines published in 2017 by MacMahon et al. CT, computed tomography; ILN, incidental lung nodule; NLP, natural language processing.
FIG 2
An example of our information extraction algorithm. (A) Initial dependency tree. (B) Final information chunks.
FIG 3
A 64-year-old man presented to the emergency department with acute dyspnea. (A) Axial and (B) coronal enhanced chest CT images demonstrated a 23-mm spiculated solid pulmonary nodule in the right upper lobe and severe emphysema. The patient was discharged from the emergency department with a final diagnosis of COVID-19 infection and advice to arrange an appointment with a thoracic surgeon. However, the patient did not proceed with further investigation, and the CT report was identified using our NLP system. The patient was recalled and, after investigation, lung cancer was confirmed at early stage IA (T1cN0M0). CT, computed tomography; NLP, natural language processing.
FIG 4
Images of a 58-year-old man, a heavy smoker, who presented to the emergency department with suspected pulmonary embolism. (A) Axial and (B) sagittal enhanced chest CT images demonstrated a 6-mm spiculated solid pulmonary nodule in the superior segment of the right lower lobe and severe emphysema. Although a control scan at 6-12 months was suggested in the initial CT report, the patient was lost to follow-up, and the case was retrieved using our NLP system. In the control CT scan performed 18 months later, the nodule had doubled in size, as demonstrated in the (C) axial and (D) sagittal enhanced CT images. The final diagnosis was lung cancer, staged as IA (T1aN0M0). CT, computed tomography; NLP, natural language processing.
