Human factors in the clinical implementation of deep learning-based automated contouring of pelvic organs at risk for MRI-guided radiotherapy
- PMID: 37646527
- DOI: 10.1002/mp.16676
Abstract
Purpose: Deep neural nets have revolutionized the science of auto-segmentation and hold great promise for treatment planning automation. However, few data exist regarding clinical implementation and human factors. We evaluated the performance and clinical implementation of a novel deep learning-based auto-contouring workflow for 0.35T magnetic resonance imaging (MRI)-guided pelvic radiotherapy, focusing on automation bias and objective measures of workflow savings.
Methods: An auto-contouring model was developed using a UNet-derived architecture for the femoral heads, bladder, and rectum in 0.35T MR images. Training data were taken from 75 patients treated with MRI-guided radiotherapy at our institution. The model was tested against 20 retrospective cases outside the training set and was subsequently implemented clinically. Usability was evaluated on the first 30 clinical cases by computing the Dice coefficient (DSC), the Hausdorff distance (HD), and the fraction of slices that were used unmodified by planners. Final contours were retrospectively reviewed by an experienced planner, and the clinical significance of deviations was graded as negligible, low, moderate, or high probability of leading to actionable dosimetric variations. To assess whether the use of auto-contouring led to final contours more or less in agreement with an objective standard, 10 pre-treatment and 10 post-treatment blinded cases were re-contoured from scratch by three expert planners to obtain expert consensus (EC) contours. EC contours were compared to clinically used (CU) contours using DSC. Student's t-test and Levene's statistic were used to test the statistical significance of differences in mean and standard deviation, respectively. Finally, the dosimetric significance of the contour differences was assessed by comparing the difference in bladder and rectum maximum point doses between EC and CU before and after the introduction of automation.
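The three usability measures named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each contour is stored per slice as a set of (row, column) pixel coordinates, works in pixel units rather than millimetres, and uses hypothetical names (`dice`, `hausdorff`, `fraction_unmodified`) and toy data.

```python
from math import hypot

def dice(a, b):
    """Dice similarity coefficient between two sets of pixel coordinates."""
    if not a and not b:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * len(a & b) / (len(a) + len(b))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty 2D point sets."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(hypot(x1 - x2, y1 - y2) for (x2, y2) in q)
                   for (x1, y1) in p)
    return max(directed(a, b), directed(b, a))

def fraction_unmodified(auto_slices, final_slices):
    """Fraction of auto-contoured slices kept exactly as generated."""
    same = sum(1 for s in auto_slices if auto_slices[s] == final_slices.get(s))
    return same / len(auto_slices)

# Toy data (hypothetical): slice index -> set of contoured pixels.
auto = {0: {(0, 0), (0, 1), (1, 0), (1, 1)}, 1: {(0, 0), (0, 1)}}
final = {0: {(0, 0), (0, 1), (1, 0), (1, 1)}, 1: {(0, 1), (0, 2)}}

slice_dsc = dice(auto[1], final[1])            # 0.5: one shared pixel of 2 + 2
slice_hd = hausdorff(auto[1], final[1])        # 1.0 pixel
kept = fraction_unmodified(auto, final)        # 0.5: slice 0 kept, slice 1 edited
```

In practice, HD would typically be computed on contour boundaries in physical units, and DSC over the whole 3D structure rather than per slice; the brute-force distance loop here is only suitable for small examples.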
Results: Median (interquartile range) DSC for the retrospective test data were 0.92 (0.02), 0.92 (0.06), 0.93 (0.06), and 0.87 (0.04) for the post-processed contours for the right and left femoral heads, bladder, and rectum, respectively. Post-implementation median DSC were 1.0 (0.0), 1.0 (0.0), 0.98 (0.04), and 0.98 (0.06), respectively. For these organs, 96.2%, 95.4%, 59.5%, and 68.21% of slices, respectively, were used unmodified by the planner. DSC between EC and pre-implementation CU contours were 0.91 (0.05*), 0.91* (0.05*), 0.95 (0.04), and 0.88 (0.04) for right and left femoral heads, bladder, and rectum, respectively. The corresponding DSC for post-implementation CU contours were 0.93 (0.02*), 0.93* (0.01*), 0.96 (0.01), and 0.85 (0.02) (asterisks indicate statistically significant difference). In a retrospective review of contours used for planning, a total of four deviating slices in two patients were graded as low potential clinical significance. No deviations were graded as moderate or high. Mean differences between EC and CU rectum max-doses were 0.1 ± 2.6 Gy and -0.9 ± 2.5 Gy for pre- and post-implementation, respectively. Mean differences between EC and CU bladder/bladder wall max-doses were -0.9 ± 4.1 Gy and 0.0 ± 0.6 Gy for pre- and post-implementation, respectively. These differences were not statistically significant according to Student's t-test.
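The comparison of means and spreads between pre- and post-implementation groups rests on two test statistics, which can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the abstract does not say which variant of Levene's test was used, so the mean-centered form is assumed; the function names and the DSC values are hypothetical and are not data from the study. Only the statistics are computed; p-values would come from the t and F distributions, for example via a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def student_t(x, y):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    # Z values: absolute deviation of every observation from its group mean.
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical DSC values (EC vs. CU), for illustration only.
pre_dsc = [0.85, 0.90, 0.95, 1.00]
post_dsc = [0.92, 0.93, 0.94, 0.95]

t_stat = student_t(pre_dsc, post_dsc)   # compares the group means
w_stat = levene_w(pre_dsc, post_dsc)    # compares the group spreads
```

In this toy example the means are close while the spreads differ markedly, which is the pattern a reduction in inter-planner variability would produce: a small t statistic alongside a large W statistic.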
Conclusion: We have presented an analysis of the clinical implementation of a novel auto-contouring workflow. Substantial workflow savings were obtained. The introduction of auto-contouring into the clinical workflow changed the contouring behavior of planners. Automation bias was observed, but it had little deleterious effect on treatment planning.
Keywords: auto-contouring; automation bias; deep-learning.
© 2023 American Association of Physicists in Medicine.
Similar articles
- Open-source deep-learning models for segmentation of normal structures for prostatic and gynecological high-dose-rate brachytherapy: Comparison of architectures. J Appl Clin Med Phys. 2025 Jun;26(6):e70089. doi: 10.1002/acm2.70089. Epub 2025 Apr 5. PMID: 40186596 Free PMC article.
- Evaluating the dosimetric impact of deep-learning-based auto-segmentation in prostate cancer radiotherapy: Insights into real-world clinical implementation and inter-observer variability. J Appl Clin Med Phys. 2025 Mar;26(3):e14569. doi: 10.1002/acm2.14569. Epub 2024 Dec 1. PMID: 39616629 Free PMC article.
- Evaluating the clinical acceptability of deep learning contours of prostate and organs-at-risk in an automated prostate treatment planning process. Med Phys. 2022 Apr;49(4):2570-2581. doi: 10.1002/mp.15525. Epub 2022 Feb 21. PMID: 35147216
- A Review of the Metrics Used to Assess Auto-Contouring Systems in Radiotherapy. Clin Oncol (R Coll Radiol). 2023 Jun;35(6):354-369. doi: 10.1016/j.clon.2023.01.016. Epub 2023 Jan 31. PMID: 36803407 Review.
- Evaluating deep learning auto-contouring for lung radiation therapy: A review of accuracy, variability, efficiency and dose, in target volumes and organs at risk. Phys Imaging Radiat Oncol. 2025 Feb 21;33:100736. doi: 10.1016/j.phro.2025.100736. eCollection 2025 Jan. PMID: 40104215 Free PMC article. Review.
Cited by
- Open-source deep-learning models for segmentation of normal structures for prostatic and gynecological high-dose-rate brachytherapy: Comparison of architectures. J Appl Clin Med Phys. 2025 Jun;26(6):e70089. doi: 10.1002/acm2.70089. Epub 2025 Apr 5. PMID: 40186596 Free PMC article.
- Recent trends in AI applications for pelvic MRI: a comprehensive review. Radiol Med. 2024 Sep;129(9):1275-1287. doi: 10.1007/s11547-024-01861-4. Epub 2024 Aug 3. PMID: 39096356 Review.
- Evaluation of deep learning-based target auto-segmentation for Magnetic Resonance Imaging-guided cervix brachytherapy. Phys Imaging Radiat Oncol. 2024 Nov 3;32:100669. doi: 10.1016/j.phro.2024.100669. eCollection 2024 Oct. PMID: 39559487 Free PMC article.