Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2023 Oct;50(10):5969-5977.
doi: 10.1002/mp.16676. Epub 2023 Aug 30.

Human factors in the clinical implementation of deep learning-based automated contouring of pelvic organs at risk for MRI-guided radiotherapy

Affiliations

Human factors in the clinical implementation of deep learning-based automated contouring of pelvic organs at risk for MRI-guided radiotherapy

Yasin Abdulkadir et al. Med Phys. 2023 Oct.

Abstract

Purpose: Deep neural nets have revolutionized the science of auto-segmentation and present great promise for treatment planning automation. However, little data exists regarding clinical implementation and human factors. We evaluated the performance and clinical implementation of a novel deep learning-based auto-contouring workflow for 0.35T magnetic resonance imaging (MRI)-guided pelvic radiotherapy, focusing on automation bias and objective measures of workflow savings.

Methods: An auto-contouring model was developed using a UNet-derived architecture for the femoral heads, bladder, and rectum in 0.35T MR images. Training data was taken from 75 patients treated with MRI-guided radiotherapy at our institution. The model was tested against 20 retrospective cases outside the training set, and subsequently was clinically implemented. Usability was evaluated on the first 30 clinical cases by computing Dice coefficient (DSC), Hausdorff distance (HD), and the fraction of slices that were used un-modified by planners. Final contours were retrospectively reviewed by an experienced planner and clinical significance of deviations was graded as negligible, low, moderate, and high probability of leading to actionable dosimetric variations. In order to assess whether the use of auto-contouring led to final contours more or less in agreement with an objective standard, 10 pre-treatment and 10 post-treatment blinded cases were re-contoured from scratch by three expert planners to get expert consensus contours (EC). EC was compared to clinically used (CU) contours using DSC. Student's t-test and Levene's statistic were used to test statistical significance of differences in mean and standard deviation, respectively. Finally, the dosimetric significance of the contour differences were assessed by comparing the difference in bladder and rectum maximum point doses between EC and CU before and after the introduction of automation.

Results: Median (interquartile range) DSC for the retrospective test data were 0.92(0.02), 0.92(0.06), 0.93(0.06), 0.87(0.04) for the post-processed contours for the right and left femoral heads, bladder, and rectum, respectively. Post-implementation median DSC were 1.0(0.0), 1.0(0.0), 0.98(0.04), and 0.98(0.06), respectively. For each organ, 96.2, 95.4, 59.5, and 68.21 percent of slices were used unmodified by the planner. DSC between EC and pre-implementation CU contours were 0.91(0.05*), 0.91*(0.05*), 0.95(0.04), and 0.88(0.04) for right and left femoral heads, bladder, and rectum, respectively. The corresponding DSC for post-implementation CU contours were 0.93(0.02*), 0.93*(0.01*), 0.96(0.01), and 0.85(0.02) (asterisks indicate statistically significant difference). In a retrospective review of contours used for planning, a total of four deviating slices in two patients were graded as low potential clinical significance. No deviations were graded as moderate or high. Mean differences between EC and CU rectum max-doses were 0.1 ± 2.6 Gy and -0.9 ± 2.5 Gy for pre- and post-implementation, respectively. Mean differences between EC and CU bladder/bladder wall max-doses were -0.9 ± 4.1 Gy and 0.0 ± 0.6 Gy for pre- and post-implementation, respectively. These differences were not statistically significant according to Student's t-test.

Conclusion: We have presented an analysis of the clinical implementation of a novel auto-contouring workflow. Substantial workflow savings were obtained. The introduction of auto-contouring into the clinical workflow changed the contouring behavior of planners. Automation bias was observed, but it had little deleterious effect on treatment planning.

Keywords: auto-contouring; automation bias; deep-learning.

PubMed Disclaimer

Similar articles

Cited by

References

REFERENCES

    1. Fiorino C, Reni M, Bolognesi A, Cattaneo GM, Calandrino R. Intra- and inter-observer variability in contouring prostate and seminal vesicles: implications for conformal treatment planning. Radiother Oncol. 1998;47(3):285-292. doi:10.1016/s0167-8140(98)00021-8
    1. Roach D, Holloway LC, Jameson MG, et al. Multi-observer contouring of male pelvic anatomy: highly variable agreement across conventional and emerging structures of interest. J Med Imaging Radiat Oncol. 2019;63(2):264-271. doi:10.1111/1754-9485.12844
    1. Vinod SK, Min M, Jameson MG, Holloway LC. A review of interventions to reduce inter-observer variability in volume delineation in radiation oncology. J Med Imaging Radiat Oncol. 2016;60(3):393-406. doi:10.1111/1754-9485.12462
    1. Peng YL, Chen L, Shen GZ, et al. Interobserver variations in the delineation of target volumes and organs at risk and their impact on dose distribution in intensity-modulated radiation therapy for nasopharyngeal carcinoma. Oral Oncol. 2018;82:1-7. doi:10.1016/j.oraloncology.2018.04.025
    1. Sklansky J. Image segmentation and feature extraction. IEEE Trans Syst Man Cybern Syst. 1978;8(4):237-247. doi:10.1109/TSMC.1978.4309944

LinkOut - more resources