Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool
- PMID: 37769221
- PMCID: PMC10581645
- DOI: 10.1200/GO.23.00191
Erratum in
Erratum: Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool. JCO Glob Oncol. 2024 Jan;10:e2300456. doi: 10.1200/GO.23.00456. PMID: 38181321.
Abstract
Purpose: To evaluate the diagnostic performance of a natural language processing (NLP) model in detecting incidental lung nodules (ILNs) in unstructured chest computed tomography (CT) reports.
Methods: All consecutive unstructured reports of chest CT scans performed at a tertiary hospital between 2020 and 2021 (n = 21,542) were retrospectively reviewed to train the NLP tool. First, internal validation was performed on a different cohort of 300 chest CT scans, using readings of both the CT scans and the reports by two radiologists as the reference. Second, external validation was performed on a cohort of random unstructured chest CT reports from 57 different hospitals, dated May 2022. A review by the same thoracic radiologists was used as the gold standard. Sensitivity, specificity, and accuracy were calculated.
Results: Of 21,542 CT reports, 484 mentioned at least one ILN (mean age, 71 ± 17.6 [standard deviation] years; women, 52%) and were included in the training set. In the internal validation (n = 300), the NLP tool detected ILNs with a sensitivity of 100.0% (95% CI, 97.6 to 100.0), a specificity of 95.9% (95% CI, 91.3 to 98.5), and an accuracy of 98.0% (95% CI, 95.7 to 99.3). In the external validation (n = 977), the NLP tool yielded a sensitivity of 98.4% (95% CI, 94.5 to 99.8), a specificity of 98.6% (95% CI, 97.5 to 99.3), and an accuracy of 98.6% (95% CI, 97.6 to 99.2). Twelve months after the initial reports, 8 patients (8.60%) had a final diagnosis of lung cancer, of whom 2 (2.15%) would have been lost to follow-up without the NLP tool.
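The sensitivity, specificity, and accuracy reported above are proportions taken from a 2 × 2 table of NLP output against the radiologists' reference reading. The following is a minimal Python sketch of that calculation, not the authors' code. Two assumptions are worth flagging: the abstract does not report the confusion matrix, so the counts in the example are hypothetical values chosen only to be consistent with the internal validation percentages (n = 300); and the abstract does not say how its confidence intervals were computed, so the sketch uses the Wilson score interval, which may differ slightly from the published bounds. The names `ConfusionCounts`, `wilson_interval`, and `diagnostic_metrics` are illustrative.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass(frozen=True)
class ConfusionCounts:
    """2 x 2 table of NLP output versus the reference reading."""
    tp: int  # reports with an ILN, flagged by the tool
    fp: int  # reports without an ILN, flagged by the tool
    tn: int  # reports without an ILN, not flagged
    fn: int  # reports with an ILN, missed by the tool


def wilson_interval(successes: int, total: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(c: ConfusionCounts, confidence: float = 0.95):
    """Return {metric: (estimate, ci_low, ci_high)} for the three metrics."""
    n = c.tp + c.fp + c.tn + c.fn
    fractions = {
        "sensitivity": (c.tp, c.tp + c.fn),  # TP / (TP + FN)
        "specificity": (c.tn, c.tn + c.fp),  # TN / (TN + FP)
        "accuracy": (c.tp + c.tn, n),        # (TP + TN) / N
    }
    return {
        name: (k / m, *wilson_interval(k, m, confidence))
        for name, (k, m) in fractions.items()
    }


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the paper.
    counts = ConfusionCounts(tp=152, fp=6, tn=142, fn=0)
    for name, (est, low, high) in diagnostic_metrics(counts).items():
        print(f"{name}: {est:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

An exact (Clopper-Pearson) interval would be a reasonable alternative when a proportion sits at or near 100%, as the internal validation sensitivity does; it needs the inverse beta distribution, which the Python standard library does not provide.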
Conclusion: NLP can be used to identify ILNs in unstructured reports with high accuracy, allowing a timely recall of patients and a potential diagnosis of early-stage lung cancer that might have been lost to follow-up.