Eur Radiol. 2024 Oct;34(10):6639-6651.

doi: 10.1007/s00330-024-10714-7. Epub 2024 Mar 27.

Enhancing a deep learning model for pulmonary nodule malignancy risk estimation in chest CT with uncertainty estimation

Affiliations

¹ Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands. dre.peeters@radboudumc.nl.
² Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands.
³ Department of Medicine, Section of Pulmonary Medicine, Herlev-Gentofte Hospital, Hellerup, Denmark.
⁴ Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark.
⁵ Radiology Department, Meander Medical Center, Maatweg 3, 3813 TZ, Amersfoort, The Netherlands.
⁶ Department of Radiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700RB, Groningen, The Netherlands.

PMID: 38536463
PMCID: PMC11399205
DOI: 10.1007/s00330-024-10714-7

Dré Peeters et al. Eur Radiol. 2024 Oct.

Abstract

Objective: To investigate the effect of uncertainty estimation on the performance of a Deep Learning (DL) algorithm for estimating malignancy risk of pulmonary nodules.

Methods and materials: In this retrospective study, we integrated an uncertainty estimation method into a previously developed DL algorithm for nodule malignancy risk estimation. Uncertainty thresholds were developed using CT data from the Danish Lung Cancer Screening Trial (DLCST), containing 883 nodules (65 malignant) collected between 2004 and 2010. We used thresholds on the 90th and 95th percentiles of the uncertainty score distribution to categorize nodules into certain and uncertain groups. External validation was performed on clinical CT data from a tertiary academic center containing 374 nodules (207 malignant) collected between 2004 and 2012. DL performance was measured using area under the ROC curve (AUC) for the full set of nodules, for the certain cases and for the uncertain cases. Additionally, nodule characteristics were compared to identify trends for inducing uncertainty.
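The split-and-evaluate procedure described in the methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the linear-interpolation percentile, the convention that scores strictly above the cut-off count as uncertain, and all toy values are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) using linear interpolation between ranks (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of malignant/benign pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_by_uncertainty(
    uncertainty: Sequence[float], threshold: float
) -> Tuple[List[int], List[int]]:
    """Return (certain, uncertain) index lists; scores above the threshold are uncertain."""
    certain = [i for i, u in enumerate(uncertainty) if u <= threshold]
    uncertain = [i for i, u in enumerate(uncertainty) if u > threshold]
    return certain, uncertain


# Toy development set (made-up uncertainty scores, not study data).
dev_uncertainty = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.60, 0.68]
threshold_90 = percentile(dev_uncertainty, 90)

# Toy external set, split with the threshold fixed on the development set.
ext_uncertainty = [0.08, 0.15, 0.25, 0.33, 0.65, 0.69, 0.66, 0.70]
ext_risk = [0.10, 0.90, 0.20, 0.80, 0.55, 0.45, 0.50, 0.60]
ext_label = [0, 1, 0, 1, 1, 0, 1, 0]  # 1 = malignant, 0 = benign
certain, uncertain = split_by_uncertainty(ext_uncertainty, threshold_90)


def subset_auc(indices: Sequence[int]) -> float:
    """AUC of the malignancy risk restricted to the given nodule indices."""
    return auc([ext_label[i] for i in indices], [ext_risk[i] for i in indices])


auc_certain = subset_auc(certain)
auc_uncertain = subset_auc(uncertain)
```

Fixing the cut-off on the development data and reusing it unchanged on the external data mirrors the design described above, where DLCST-derived thresholds are applied to the clinical dataset.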

Results: The DL algorithm performed significantly worse in the uncertain group compared to the certain group of DLCST (AUC 0.62 (95% CI: 0.49, 0.76) vs 0.93 (95% CI: 0.88, 0.97); p < .001) and the clinical dataset (AUC 0.62 (95% CI: 0.50, 0.73) vs 0.90 (95% CI: 0.86, 0.94); p < .001). The uncertain group included larger benign nodules as well as more part-solid and non-solid nodules than the certain group.

Conclusion: The integrated uncertainty estimation showed excellent performance for identifying uncertain cases in which the DL-based nodule malignancy risk estimation algorithm had significantly worse performance.

Clinical relevance statement: Deep Learning algorithms often lack the ability to gauge and communicate uncertainty. For safe clinical implementation, uncertainty estimation is of pivotal importance to identify cases where the deep learning algorithm harbors doubt in its prediction.

Key points:
• Deep learning (DL) algorithms often lack uncertainty estimation, which could reduce the risk of errors and improve safety during clinical adoption of the DL algorithm.
• Uncertainty estimation identifies pulmonary nodules for which the discriminative performance of the DL algorithm is significantly worse.
• Uncertainty estimation can further enhance the benefits of the DL algorithm and improve its safety and trustworthiness.

Keywords: Deep learning; Multiple pulmonary nodules; Tomography (X-ray computed); Uncertainty.

Conflict of interest statement

The authors declare the following competing interests:

MP receives grants from Canon Medical Systems and Siemens Healthineers; royalties from MeVis Medical Solutions; and payment for lectures from Canon Medical Systems and Siemens Healthineers. The host institution of MP is a minority shareholder in Thirona. He reports no other relationships that are related to the subject matter of the article.

The host institution of CJ receives research grants and royalties from MeVis Medical Solutions, Bremen, Germany, and payment for lectures from Canon Medical Systems. CJ is a collaborator in a public-private research project where Radboudumc collaborates with Philips Medical Systems (Best, the Netherlands). CJ is a member of the Scientific Editorial Board for European Radiology (Imaging Informatics and Artificial Intelligence). He has not taken part in the selection or review processes for this article. He reports no other relationships that are related to the subject matter of the article.

The host institution of HH receives grants from Siemens Healthineers. He reports no other relationships that are related to the subject matter of the article.

RV is supported by an institutional research grant from Siemens Healthineers.

Figures

**Fig. 1**
Flow chart of the data collection and selection of pulmonary nodules. a Nodules from the DLCST were used for the development of the uncertainty estimations. b Incidental nodules from the clinical dataset for validation of the uncertainty estimations. Retrieval errors may be due to anonymization, image quality, protected patients, or scan availability. Radiologist score of 0, 1, or 2 indicates no lesion, a benign nodule, or indeterminate nodule in the tumor-bearing lobe, respectively

**Fig. 2**
Schematic overview of how the uncertainty score is used to split the dataset into a certain and an uncertain group. A nodule block (50 × 50 × 50 mm) is the input to the DL algorithm, which outputs a malignancy risk and an uncertainty score. The uncertainty score is determined from the scores of the individual algorithms in the ensemble. A 90th/95th percentile cut-off value on the uncertainty distribution of all nodules in the dataset is used to split it into a certain and an uncertain group to compare algorithm performance. The uncertainty distribution is based on the mean entropy of the individual outputs of the DL algorithm

**Fig. 3**
AUC for a nodule malignancy risk estimation task when using mean entropy to determine certain and uncertain cases of the DLCST dataset. AUC: area under the receiver operating characteristic curve

**Fig. 4**
AUC for a nodule malignancy risk estimation task when using the DLCST 90th and 95th percentile threshold of mean entropy to determine certain and uncertain cases of the clinical dataset. AUC: area under the receiver operating characteristic curve

**Fig. 5**
Examples of uncertain cases from the Danish Lung Cancer Screening Trial (DLCST) dataset and the Clinical dataset. Numbers in the bottom right corner of each image indicate the predicted DL malignancy risk, with an extent of color filling in the rings that is proportional to the malignancy risk. A malignancy risk of 0 represents the lowest risk, and 1 represents the highest risk. Arrows indicate the nodule location. DL: Deep Learning Malignancy Risk Estimation. Small: < 6 mm, medium: ≥ 6 to < 8 mm and large: ≥ 8 mm

**Fig. 6**
Examples of certain cases from the Danish Lung Cancer Screening Trial (DLCST) dataset and the Clinical dataset. Numbers in the bottom right corner of each image indicate the predicted DL malignancy risk, with an extent of color filling in the rings that is proportional to the malignancy risk. A malignancy risk of 0 represents the lowest risk, and 1 represents the highest risk. Arrows indicate the nodule location. DL: Deep Learning Malignancy Risk Estimation. Small: < 6 mm, medium: ≥ 6 to < 8 mm, and large: ≥ 8 mm
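
The Fig. 2 caption describes the uncertainty score as the mean entropy of the individual outputs of the ensemble. One way to compute such a score is sketched below. The record gives no exact formula, so the use of binary Shannon entropy, the base-2 logarithm, the plain average over ensemble members, and the function names are all assumptions for illustration.

```python
import math
from typing import Sequence


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a malignant/benign outcome with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a fully confident output carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_entropy(member_risks: Sequence[float]) -> float:
    """Average the per-member entropies of an ensemble's malignancy risks (assumed form)."""
    if not member_risks:
        raise ValueError("member_risks must be non-empty")
    return sum(binary_entropy(p) for p in member_risks) / len(member_risks)


# Hypothetical ensemble outputs for two nodules (made-up values).
confident_nodule = [0.02, 0.03, 0.01]
ambiguous_nodule = [0.45, 0.55, 0.50]
score_confident = mean_entropy(confident_nodule)
score_ambiguous = mean_entropy(ambiguous_nodule)
```

Under these assumptions the score is highest when every member outputs a risk near 0.5 and lowest when every member is close to 0 or 1, so a percentile cut-off on the score separates the least decisive predictions from the rest.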

Cited by

Diagnostic accuracy of deep learning for the invasiveness assessment of ground-glass nodules with fine segmentation: a systematic review and meta-analysis.
Wu W, Gao C, Wu L, Gao C, Li J, Su Z, Zhong H, Xu M, Sun Z. Quant Imaging Med Surg. 2025 Apr 1;15(4):2722-2738. doi: 10.21037/qims-24-1839. Epub 2025 Mar 28. PMID: 40235789. Free PMC article.
Artificial Intelligence interpretation of chest radiographs in intensive care. Ready for prime time?
Joskowicz L, Beil M, Sviri S. Intensive Care Med. 2025 Jan;51(1):154-156. doi: 10.1007/s00134-024-07725-9. Epub 2024 Nov 20. PMID: 39565379. No abstract available.

Grants and funding

14113/KWF Kankerbestrijding
