Radiother Oncol. 2024 Dec;201:110542. doi: 10.1016/j.radonc.2024.110542. Epub 2024 Sep 17.

Artificial intelligence uncertainty quantification in radiotherapy applications - A scoping review


Kareem A Wahid et al. Radiother Oncol. 2024 Dec.

Abstract

Background/purpose: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions.

Methods: We followed the PRISMA-ScR scoping review reporting guidelines. We used a population (human cancer patients), concept (use of AI UQ), and context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics.

Results: We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image synthesis (13 %) and multiple applications simultaneously (11 %). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data were used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %), followed by ensembling (16 %). Code or datasets were not shared by 55 % of studies.
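As an orienting aside, the sketch below shows one way Monte Carlo dropout can be used to obtain predictive uncertainty, since it was the most frequently implemented UQ method in the reviewed studies. The toy classifier, layer sizes, and number of stochastic forward passes are illustrative assumptions and are not taken from any specific reviewed study.

# Minimal sketch of Monte Carlo dropout UQ (assumptions: toy classifier,
# dummy inputs, 30 stochastic passes; none of these come from the review).
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    def __init__(self, in_features=64, n_classes=2, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128),
            nn.ReLU(),
            nn.Dropout(p_drop),  # kept stochastic at inference time
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=30):
    """Run repeated stochastic forward passes with dropout enabled and
    return the mean softmax probabilities plus their per-class variance,
    which serves as a simple uncertainty estimate."""
    model.train()  # keeps dropout active; gradients are still disabled below
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.var(dim=0)

if __name__ == "__main__":
    model = SmallClassifier()
    x = torch.randn(4, 64)  # four dummy feature vectors
    mean_probs, var_probs = mc_dropout_predict(model, x)
    print(mean_probs)
    print(var_probs)

In practice, high per-class variance across the stochastic passes can be used to flag individual predictions for review, which matches the failure-detection use of UQ that dominated the reviewed literature.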

Conclusion: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
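For readers unfamiliar with conformal prediction, the sketch below illustrates split conformal prediction for a generic classifier; the calibration data, nonconformity score, and alpha level are hypothetical and not drawn from the reviewed literature.

# Minimal sketch of split conformal prediction (assumptions: dummy
# softmax outputs and labels, alpha = 0.1; purely illustrative).
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute the conformal quantile of calibration nonconformity scores,
    here 1 minus the probability assigned to the true class."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q_level, method="higher")

def prediction_sets(test_probs, qhat):
    """Return a boolean mask of classes retained in each prediction set:
    a class is included when its nonconformity score is at most qhat."""
    return (1.0 - test_probs) <= qhat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cal_probs = rng.dirichlet(np.ones(3), size=200)  # dummy calibration softmax outputs
    cal_labels = rng.integers(0, 3, size=200)        # dummy calibration labels
    qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
    test_probs = rng.dirichlet(np.ones(3), size=5)
    print(prediction_sets(test_probs, qhat))         # one prediction set per test case

Under standard exchangeability assumptions, sets built this way contain the true class with probability of at least 1 - alpha, a distribution-free guarantee that does not depend on the underlying model.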


Conflict of interest statement

Declaration of competing interest The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: [KAW serves as an Editorial Board Member for Physics and Imaging in Radiation Oncology. CDF has received travel, speaker honoraria and/or registration fee waiver unrelated to this project from: The American Association for Physicists in Medicine; the University of Alabama-Birmingham; The American Society for Clinical Oncology; The Royal Australian and New Zealand College of Radiologists; The American Society for Radiation Oncology; The Radiological Society of North America; and The European Society for Radiation Oncology].

Figures

Fig. 1.
General study characteristics. (A) Stacked barplot showing the total number of publications per country by publication type. (B) Heatmap of the number of studies by continent, where green indicates a low number of publications and blue indicates a high number of publications; continents from which no studies were extracted are shown in white. (C) Stacked barplot showing code and data availability over time. Each item in the barplots corresponds to one study. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 2.
Radiotherapy characteristics. (A) Stacked barplot showing cancer disease site per radiotherapy application domain. The "Other" category for cancer type included cervical, liver, esophageal, pancreatic, cardiac, breast, and pelvic. The "Other" category for radiotherapy application included nodal classification, tumor growth modeling, and image correction. (B) Stacked barplot showing additional data per imaging modality represented. The "Other" category for additional data included registration transforms, respiratory trace, K-space, fiducial, clinical data, target + clinical data, dose + clinical data, and dose + clinical data + target + probability map. Each item in the barplots corresponds to one study.
Fig. 3.
Artificial intelligence characteristics. (A) Scatter plot showing the number of training, validation, and testing patients used in studies. Only studies that explicitly reported patient-level sample sizes are included. The three studies with the highest sample sizes in each category are annotated. (B) Bar plot showing types of testing strategies used in studies. Each item in the barplot corresponds to one study.
Fig. 4.
Uncertainty quantification characteristics. (A) Tree map of uncertainty quantification applications represented in the studies. (B) Tree map of uncertainty quantification methods represented in the studies. (C) Tree map of uncertainty quantification metrics represented in the studies. Each item in the tree maps corresponds to a reported item (there could be multiple per study).

