Automatic contouring QA method using a deep learning-based autocontouring system

Dong Joo Rhee¹,², Chidinma P Anakwenze Akinfenwa³, Bastien Rigaud⁴, Anuja Jhingran³, Carlos E Cardenas², Lifei Zhang², Surendra Prajapati², Stephen F Kry², Kristy K Brock⁴, Beth M Beadle⁵, William Shaw⁶, Frederika O'Reilly⁶, Jeannette Parkes⁷, Hester Burger⁷, Nazia Fakie⁷, Chris Trauernicht⁸, Hannah Simonds⁹, Laurence E Court²

Affiliations

¹ The University of Texas Graduate School of Biomedical Sciences at Houston, Houston, Texas, USA.
² Department of Radiation Physics, Division of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
³ Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
⁴ Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
⁵ Department of Radiation Oncology, Stanford University School of Medicine, Stanford, California, USA.
⁶ Department of Medical Physics (G68), University of the Free State, Bloemfontein, South Africa.
⁷ Division of Radiation Oncology and Medical Physics, University of Cape Town and Groote Schuur Hospital, Cape Town, South Africa.
⁸ Division of Medical Physics, Stellenbosch University, Tygerberg Academic Hospital, Cape Town, South Africa.
⁹ Division of Radiation Oncology, Stellenbosch University, Tygerberg Academic Hospital, Cape Town, South Africa.

PMID: 35580067
PMCID: PMC9359039
DOI: 10.1002/acm2.13647

J Appl Clin Med Phys. 2022 Aug;23(8):e13647. doi: 10.1002/acm2.13647. Epub 2022 May 17.

Abstract

Purpose: To determine the most accurate similarity metric when using an independent system to verify automatically generated contours.

Methods: A reference autocontouring system (the primary system, used to create clinical contours) and a verification autocontouring system (the secondary system, used to check the primary contours) each generated contours of six female pelvic structures (UteroCervix [uterus + cervix], CTVn [nodal clinical target volume (CTV)], PAN [para-aortic lymph nodes], bladder, rectum, and kidneys) on 49 CT scans from our institution and 38 from other institutions. Additionally, clinically acceptable and unacceptable contours were manually generated on the 49 internal CT scans. Eleven similarity metrics (volumetric Dice similarity coefficient [DSC], Hausdorff distance, 95% Hausdorff distance, mean surface distance, and surface DSC with tolerances from 1 to 10 mm) were calculated between the reference and the verification autocontours, and between the manually generated contours and the verification autocontours. A support vector machine (SVM) was used to determine the threshold that separates clinically acceptable from unacceptable contours for each structure. The 11 metrics were investigated individually and in certain combinations. Linear, radial basis function, sigmoid, and polynomial kernels were tested with the combinations of metrics as inputs to the SVM.
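To make the similarity metrics concrete, the following is a minimal Python sketch of how they might be computed from two contours. It is an illustration under stated assumptions, not the authors' implementation: contours are assumed to be given as sets of voxel indices (for the volumetric DSC) and as lists of surface points in millimetres (for the distance-based metrics), the 95% Hausdorff distance is taken as the larger of the two directed 95th percentiles using a nearest-rank rule, and the surface DSC is approximated by counting surface points rather than integrating surface area. The function names `dice`, `percentile`, and `surface_metrics` are hypothetical.

```python
import math


def dice(vox_a, vox_b):
    """Volumetric DSC between two sets of voxel indices."""
    a, b = set(vox_a), set(vox_b)
    if not a and not b:
        return 1.0  # two empty contours agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))


def percentile(values, q):
    """Nearest-rank percentile (an assumed convention; others exist)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def _nearest_distances(points, others):
    """Distance from each point to its closest point in `others`."""
    return [min(math.dist(p, q) for q in others) for p in points]


def surface_metrics(surf_a, surf_b, tolerance_mm):
    """Distance-based metrics between two point-sampled surfaces (mm)."""
    d_ab = _nearest_distances(surf_a, surf_b)
    d_ba = _nearest_distances(surf_b, surf_a)
    # Surface points of either contour lying within the tolerance of the other.
    within = sum(d <= tolerance_mm for d in d_ab) + sum(
        d <= tolerance_mm for d in d_ba
    )
    return {
        "hausdorff": max(max(d_ab), max(d_ba)),
        "hd95": max(percentile(d_ab, 95), percentile(d_ba, 95)),
        "mean_surface_distance": (
            sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)
        ) / 2.0,
        "surface_dsc": within / (len(d_ab) + len(d_ba)),
    }


if __name__ == "__main__":
    # Toy 2D example with made-up coordinates, for illustration only.
    reference = [(0, 0), (1, 0), (2, 0), (3, 0)]
    verification = [(0, 1), (1, 1), (2, 1), (3, 3)]
    for tol in (1.0, 2.0, 3.0):
        print(tol, surface_metrics(reference, verification, tol)["surface_dsc"])
```

The brute-force nearest-neighbour search is quadratic in the number of surface points, which is acceptable for a sketch but would likely be replaced by a distance transform or spatial index on full-resolution CT contours.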

Results: The highest contouring error detection accuracies were 0.91 for the UteroCervix, 0.90 for the CTVn, 0.89 for the PAN, 0.92 for the bladder, 0.95 for the rectum, and 0.97 for the kidneys. These were achieved using surface DSCs with a tolerance of 1, 2, or 3 mm. When a combination of metrics was used as the input to the SVM, the linear kernel was the most accurate and consistent. However, the best model built on a combination of metrics was no more accurate than the best model built on a single surface DSC.
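With a single metric as input, a linear SVM reduces to one decision threshold on that metric. The sketch below illustrates this for the simplest case, assuming the acceptable and unacceptable scores do not overlap: the maximum-margin (hard-margin) boundary is then the midpoint between the highest unacceptable score and the lowest acceptable score. This is a simplification of the study's approach, which fitted an SVM with a penalty parameter C and therefore tolerates overlapping classes. The function names and all scores are hypothetical and are not data from the study.

```python
def hard_margin_threshold(acceptable, unacceptable):
    """Maximum-margin threshold for separable 1D scores (higher = better)."""
    highest_bad = max(unacceptable)
    lowest_good = min(acceptable)
    if highest_bad >= lowest_good:
        # Overlapping classes need a soft-margin SVM with a finite penalty C.
        raise ValueError("scores overlap; hard-margin threshold undefined")
    return (highest_bad + lowest_good) / 2.0


def accuracy(threshold, acceptable, unacceptable):
    """Fraction of contours classified correctly by the threshold."""
    correct = sum(s >= threshold for s in acceptable)
    correct += sum(s < threshold for s in unacceptable)
    return correct / (len(acceptable) + len(unacceptable))


if __name__ == "__main__":
    # Hypothetical surface DSC scores against the verification autocontour.
    train_ok = [0.92, 0.88, 0.95, 0.81]
    train_bad = [0.55, 0.62, 0.70, 0.40]
    threshold = hard_margin_threshold(train_ok, train_bad)

    # Hypothetical held-out scores, as one fold of a cross-validation would use.
    test_ok = [0.90, 0.74]
    test_bad = [0.60, 0.78]
    print("threshold:", threshold)
    print("held-out accuracy:", accuracy(threshold, test_ok, test_bad))
```

A contour scoring at or above the threshold is labelled acceptable here. In practice a high score only indicates agreement between the two autocontouring systems, so a case in which both systems fail in the same way would still pass.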

Conclusions: We distinguished clinically acceptable contours from clinically unacceptable contours with an accuracy of approximately 0.9 or higher (0.89 to 0.97) for the targets and critical structures in patients with cervical cancer; the most accurate similarity metric was surface DSC with a tolerance of 1, 2, or 3 mm.

Keywords: auto-contour; deep learning; similarity metrics.

Conflict of interest statement

This work was partially funded by the National Cancer Institute and Varian Medical Systems.

Figures

**FIGURE 1**
Examples of manually generated, clinically acceptable (green) and unacceptable (red) contours for the (a) UteroCervix, (b) bladder, (c) right kidney, and (d) rectum. (e) The reference autocontour (yellow) was clinically unacceptable, whereas the verification autocontour (blue) was clinically acceptable. (f) Both the reference and the verification autocontours were clinically unacceptable.

**FIGURE 2**
(a) Diagram of the data acquisition process for automatic contour QA model development. (b) Each set was split equally into three subsets for threefold cross‐validation. QA, quality assurance

**FIGURE 3**
Average accuracies of the contour QA model with an individual metric for each structure with various penalty parameters, C. The error bar represents ±1 standard deviation from threefold cross‐validation. QA, quality assurance

**FIGURE 4**
The ROC curves with a surface DSC with a tolerance of 2 mm, the best metric to predict the clinical acceptability of the automatically generated contours. DSC, Dice similarity coefficient

**FIGURE 5**
Average accuracies of the SVM model with multiple metrics for each structure. The error bar represents ±1 standard deviation. Four different kernels (linear, polynomial, rbf, and sigmoid) were tested. rbf, radial basis function; SVM, support vector machine

**FIGURE 6**
False positives can make the thresholds more generous (blue dashed lines) than the desired thresholds (brown dashed lines) and result in more false negatives in clinical situations.

**FIGURE 7**
The surface DSC distributions of the clinically acceptable and unacceptable kidney contours with (left) and without (right) the manually generated contours. The threshold can be determined confidently with the manual contours, whereas without them it can lie anywhere between the blue and red dashed lines because of insufficient data. DSC, Dice similarity coefficient
