Adv Radiat Oncol. 2024 Apr 21;9(7):101521.
doi: 10.1016/j.adro.2024.101521. eCollection 2024 Jul.

Evolving Horizons in Radiation Therapy Auto-Contouring: Distilling Insights, Embracing Data-Centric Frameworks, and Moving Beyond Geometric Quantification

Kareem A Wahid et al. Adv Radiat Oncol.
No abstract available

Conflict of interest statement

K.A.W. serves as an Editorial Board Member for Physics and Imaging in Radiation Oncology. C.D.F. has received travel, speaker honoraria, and/or registration fee waivers unrelated to this project from The American Association for Physicists in Medicine, the University of Alabama-Birmingham, The American Society for Clinical Oncology, The Royal Australian and New Zealand College of Radiologists, The American Society for Radiation Oncology, The Radiologic Society of North America, and The European Society for Radiation Oncology. The other authors have no interests to disclose. During the preparation of this work, the authors used ChatGPT (GPT-4 architecture; ChatGPT September 25 Version) to improve the grammatical accuracy and semantic structure of portions of the text. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Figures

Figure 1
A deep learning model trained with a few highly consistent, ie, high-quality, contours (green) was more closely aligned to the reference standard test data than a model trained with many inconsistent contours (red) for various head and neck cancer radiation therapy structures. The 95% Hausdorff distance (HD95) (A) and mean distance to agreement (mDTA) (B) were used as geometric performance quantification metrics. Lower values for both metrics indicate better performance. Reprinted from Henderson et al.
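
As an illustration of the two metrics named in this caption, the following is a minimal Python sketch, assuming each contour has already been reduced to an array of surface points with coordinates in millimeters. The function names (`directed_distances`, `hd95`, `mdta`) are hypothetical, and definitions of both metrics vary between toolkits: here HD95 is taken as the larger of the two directed 95th-percentile distances and mDTA as the average of the two directed mean distances. It is not necessarily how Henderson et al. computed them.

```python
import numpy as np

def directed_distances(a, b):
    """Distance from every point in `a` to its nearest point in `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a[:, None, :] - b[None, :, :]          # pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def hd95(a, b):
    """95% Hausdorff distance (assumed: max of the two directed percentiles)."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return float(max(np.percentile(d_ab, 95), np.percentile(d_ba, 95)))

def mdta(a, b):
    """Mean distance to agreement (assumed: average of the two directed means)."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return float((d_ab.mean() + d_ba.mean()) / 2.0)

# Toy example: two parallel line segments 1 mm apart.
reference = [[0.0, 0.0], [1.0, 0.0]]
predicted = [[0.0, 1.0], [1.0, 1.0]]
print(hd95(reference, predicted), mdta(reference, predicted))
```

The pairwise distance matrix grows with the product of the two point counts, so this form suits small illustrations only; for clinical-size surfaces a spatial index or distance transform would be the usual choice.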
Figure 2
Consensus from a limited number of nonexpert contours can approximate expert reference standard benchmarks. A specific plot is shown for the left parotid gland in a head and neck cancer case using the volumetric Dice similarity coefficient (DSC) as a performance quantification metric. The simultaneous truth and performance level estimation (STAPLE) algorithm was used to generate consensus contours. To explore consensus quality dynamics based on the number of nonexpert inputs, bootstrap resampling selected random nonexpert subsets with replacement to form consensus contours, which were then compared with expert consensus. Each dot represents the median from 100 bootstrap iterations with a 95% confidence interval (shaded area). The black dotted line indicates the median expert DSC interobserver variability (IOV). The gray dotted line indicates DSC performance for the maximum number of nonexperts used in the consensus. For this example, 3 to 4 nonexperts can approximate expert IOV benchmarks. As the number of nonexperts in the consensus contour increases, performance generally improves before plateauing. Adapted from Lin et al.
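
The resampling procedure described in this caption can be sketched roughly as follows. This is an illustrative sketch only: it swaps in a simple majority vote for the STAPLE algorithm, assumes a percentile interval for the 95% confidence interval (the caption does not say how the interval was derived), and uses hypothetical names (`dice`, `majority_vote`, `bootstrap_consensus_dsc`) that do not come from Lin et al.

```python
import numpy as np

def dice(a, b):
    """Volumetric Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / total)

def majority_vote(masks):
    """Stand-in for STAPLE: a voxel is kept if at least half the masks include it."""
    masks = np.asarray(masks, dtype=bool)
    return masks.mean(axis=0) >= 0.5

def bootstrap_consensus_dsc(nonexpert_masks, expert_consensus, k,
                            n_iter=100, seed=0):
    """Median DSC (and assumed percentile 95% interval) for consensus of k nonexperts."""
    rng = np.random.default_rng(seed)
    masks = np.asarray(nonexpert_masks, dtype=bool)
    scores = []
    for _ in range(n_iter):
        # Sample k nonexperts with replacement, as in the caption.
        idx = rng.integers(0, len(masks), size=k)
        scores.append(dice(majority_vote(masks[idx]), expert_consensus))
    low, high = np.percentile(scores, [2.5, 97.5])
    return float(np.median(scores)), (float(low), float(high))
```

Calling `bootstrap_consensus_dsc` for k = 1, 2, 3, and so on, and plotting the medians against k, would give a curve of the same kind as the one in the figure.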
Figure 3
Relatively small training sample sizes are needed to reach high geometric performance for deep learning auto-contouring models. The percentage of the volumetric Dice similarity coefficient (DSC) using different training sample sizes relative to the maximum DSC for individual contour structures is shown in different colors. Most organ-at-risk structures required ∼40 patient samples to achieve 95% of the maximum possible performance; notably, lenses and optic nerves required 200 samples to achieve 95% of the maximum possible performance. Reprinted from Fang et al.
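
To make the "percentage of maximum DSC" quantity concrete, here is a small hedged sketch. The function names and the numbers in the example are hypothetical placeholders chosen for illustration; they are not values from Fang et al.

```python
def relative_dsc(dsc_values):
    """Express each DSC as a percentage of the best DSC observed for that structure."""
    best = max(dsc_values)
    return [100.0 * d / best for d in dsc_values]

def smallest_size_reaching(sizes, dsc_values, fraction=0.95):
    """Smallest training sample size whose DSC reaches `fraction` of the maximum."""
    if not sizes or len(sizes) != len(dsc_values):
        raise ValueError("sizes and dsc_values must be non-empty and equal length")
    best = max(dsc_values)
    for n, d in sorted(zip(sizes, dsc_values)):
        if d >= fraction * best:
            return n

# Hypothetical learning curve for one structure (not data from the paper).
sizes = [10, 20, 40, 200]
dsc = [0.70, 0.78, 0.82, 0.85]
print(smallest_size_reaching(sizes, dsc))  # 40
```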
Figure 4
HEad and neCK TumOR (HECKTOR) data challenge auto-contouring geometric performance saturation over time. Contouring performance was measured by the Dice similarity coefficient of primary tumor predictions on the test set for each year of the challenge (2020, 2021, and 2022). Training and test set patient sample sizes are shown in parentheses. Red and blue dots correspond to the best performance (mean of the top 3 teams for that year, ie, winners) and average performance (median across all the participating teams for that year), respectively. Seventeen, 22, and 22 teams had scores reported for the challenge in the years 2020, 2021, and 2022, respectively. The gray dotted line corresponds to a clinician expert interobserver variability benchmark. Data were derived from corresponding yearly HECKTOR conference proceedings.
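
The two per-year summaries described in this caption (mean of the top 3 teams and median across all teams) amount to the following arithmetic. This is a sketch with a hypothetical function name and made-up scores, not the HECKTOR leaderboard values.

```python
from statistics import mean, median

def summarize_year(team_scores, top_n=3):
    """Return (mean of the top_n scores, median of all scores) for one challenge year."""
    ranked = sorted(team_scores, reverse=True)
    return mean(ranked[:top_n]), median(ranked)

# Hypothetical Dice scores for five teams in a single year.
best, average = summarize_year([0.5, 0.6, 0.7, 0.8, 0.9])
print(best, average)
```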
