Natural Language Processing to Identify Pulmonary Nodules and Extract Nodule Characteristics From Radiology Reports

Chengyi Zheng¹, Brian Z Huang², Andranik A Agazaryan³, Beth Creekmur², Thearis A Osuj², Michael K Gould⁴

Affiliations

¹ Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA. Electronic address: Chengyi.X.Zheng@kp.org.
² Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA.
³ Los Angeles Medical Center, Kaiser Permanente Southern California, Los Angeles, CA.
⁴ Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA.

PMID: 34089738
DOI: 10.1016/j.chest.2021.05.048

Chengyi Zheng et al. Chest. 2021 Nov.

Chest. 2021 Nov;160(5):1902-1914.

doi: 10.1016/j.chest.2021.05.048. Epub 2021 Jun 4.

Abstract

Background: There is an urgent need for population-based studies on managing patients with pulmonary nodules.

Research question: Is it possible to identify pulmonary nodules and associated characteristics using an automated method?

Study design and methods: We revised and refined an existing natural language processing (NLP) algorithm to identify radiology transcripts with pulmonary nodules and greatly expanded its functionality to identify the characteristics of the largest nodule, when present, including size, lobe, laterality, attenuation, calcification, and edge. We compared NLP results with a reference standard of manual transcript review in a random test sample of 200 radiology transcripts. We applied the final automated method to a larger cohort of patients who underwent chest CT scan in an integrated health care system from 2006 to 2016, and described their demographic and clinical characteristics.
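As a rough illustration of the kind of extraction described above, the sketch below pulls the size, laterality, and lobe of the largest nodule from free-text report sentences using regular expressions. It is not the study's algorithm, whose rules are not given in this abstract; the patterns, the function name `extract_largest_nodule`, and the simple same-sentence negation check are all assumptions made for this example.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for illustration only; not the study's rule set.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)
LOBE_RE = re.compile(r"\b(upper|middle|lower)\s+lobe\b", re.IGNORECASE)
SIDE_RE = re.compile(r"\b(right|left)\b", re.IGNORECASE)
NODULE_RE = re.compile(r"\bnodules?\b", re.IGNORECASE)
# Crude negation: "no ... nodule" within the same sentence.
NEGATION_RE = re.compile(r"\bno\b[^.]*\bnodules?\b", re.IGNORECASE)


def extract_largest_nodule(report: str) -> Optional[Dict[str, object]]:
    """Return size (mm), laterality, and lobe of the largest nodule, or None."""
    best: Optional[Dict[str, object]] = None
    # Split on sentence-ending punctuation followed by whitespace,
    # so decimals such as "1.2 cm" stay intact.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        if not NODULE_RE.search(sentence) or NEGATION_RE.search(sentence):
            continue
        # Normalize every measurement in the sentence to millimeters.
        sizes_mm = [
            float(value) * (10.0 if unit.lower() == "cm" else 1.0)
            for value, unit in SIZE_RE.findall(sentence)
        ]
        side = SIDE_RE.search(sentence)
        lobe = LOBE_RE.search(sentence)
        candidate = {
            "size_mm": max(sizes_mm) if sizes_mm else None,
            "laterality": side.group(1).lower() if side else None,
            "lobe": lobe.group(1).lower() if lobe else None,
        }
        # Keep the first nodule mention, then prefer any larger measured one.
        if best is None or (
            candidate["size_mm"] is not None
            and (best["size_mm"] is None or candidate["size_mm"] > best["size_mm"])
        ):
            best = candidate
    return best


example = (
    "There is a 6 mm nodule in the right upper lobe. "
    "A 1.2 cm spiculated nodule is seen in the left lower lobe. "
    "No pleural effusion."
)
result = extract_largest_nodule(example)
```

A production system would also need to handle multi-dimensional measurements, negation and uncertainty across sentences, and the attenuation, calcification, and edge descriptors mentioned above, none of which this sketch attempts.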

Results: In the test sample, the NLP algorithm had very high sensitivity (98.6%; 95% CI, 95.0%-99.8%) and specificity (100%; 95% CI, 93.9%-100%) for identifying pulmonary nodules. For attenuation, edge, and calcification, the NLP algorithm achieved similar accuracies, and it correctly identified the diameter of the largest nodule in 135 of 141 cases (95.7%; 95% CI, 91.0%-98.4%). In the larger cohort, the NLP algorithm found 217,771 reports with nodules among 717,304 chest CT reports (30.4%). From 2006 to 2016, the number of reports with nodules increased by 150%, and the mean size of the largest nodule gradually decreased from 11 to 8.9 mm. Radiologists documented the laterality and lobe (90%-95%) more often than the attenuation, calcification, and edge characteristics (11%-14%).
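The proportions reported above can be checked with a short calculation. The sketch below computes an exact (Clopper-Pearson) binomial confidence interval for the 135 of 141 diameter result and the share of reports with nodules in the larger cohort. The abstract does not state which interval method the authors used, so the choice of Clopper-Pearson is an assumption, and the function names are introduced only for this example.

```python
from math import comb
from typing import Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided binomial CI (assumed method, not confirmed by the source)."""

    def solve(k: int, target: float) -> float:
        # binom_cdf(k, n, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if binom_cdf(k, n, mid) > target:
                lo = mid  # p is still too small
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: P(X >= x | p) = alpha/2, i.e. cdf(x - 1) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve(x - 1, 1.0 - alpha / 2.0)
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve(x, alpha / 2.0)
    return lower, upper


# Diameter of the largest nodule: 135 correct out of 141 cases.
diameter_accuracy = 135 / 141
diameter_ci = clopper_pearson(135, 141)

# Share of chest CT reports with nodules in the larger cohort.
nodule_share = 217_771 / 717_304

print(f"accuracy = {diameter_accuracy:.1%}")
print(f"95% CI   = {diameter_ci[0]:.1%} to {diameter_ci[1]:.1%}")
print(f"reports with nodules = {nodule_share:.1%}")
```

Under this assumption, the results should fall close to the reported 95.7% (95% CI, 91.0%-98.4%) and 30.4%. Small differences in the last digit are possible if the authors used a different interval method.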

Interpretation: The NLP algorithm identified pulmonary nodules and associated characteristics with high accuracy. In our community practice settings, the documentation of nodule characteristics is incomplete. Our results call for better documentation of nodule findings. The NLP algorithm can be used in population-based studies to identify pulmonary nodules, avoiding labor-intensive chart review.

Keywords: chest CT scan; natural language processing; nodule characteristics; pulmonary nodule; radiology reports.
