Med Phys. 2019 Nov;46(11):5086-5097.
doi: 10.1002/mp.13814. Epub 2019 Sep 26.

Automatic detection of contouring errors using convolutional neural networks

Dong Joo Rhee et al. Med Phys. 2019 Nov.

Abstract

Purpose: To develop a head and neck normal structures autocontouring tool that could be used to automatically detect the errors in autocontours from a clinically validated autocontouring tool.

Methods: An autocontouring tool based on convolutional neural networks (CNN) was developed for 16 normal structures of the head and neck and tested for its ability to identify contour errors from a clinically validated multiatlas-based autocontouring system (MACS). The computed tomography (CT) scans and clinical contours from 3495 patients were semiautomatically curated and used to train and validate the CNN-based autocontouring tool. The final accuracy of the tool was evaluated by calculating the Sørensen-Dice similarity coefficients (DSC) and Hausdorff distances between the automatically generated contours and physician-drawn contours on 174 internal and 24 external CT scans. Lastly, the CNN-based tool was evaluated on 60 patients' CT scans to investigate whether it could detect contouring failures. The contouring failures on these patients were classified as either minor or major errors. The criteria to detect contouring errors were determined by analyzing the DSC between the CNN- and MACS-based contours under two independent scenarios: (a) contours with minor errors are clinically acceptable and (b) contours with minor errors are clinically unacceptable.
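
The two agreement metrics named in the Methods, and the idea of flagging a MACS contour when its DSC against the CNN contour is low, can be illustrated with a minimal sketch. It assumes each contour is represented as a set of voxel index tuples; the function names and the 0.8 threshold are hypothetical and are not taken from the paper, which derives its own per-structure thresholds.

```python
import math


def dice_coefficient(a, b):
    """Sørensen-Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty contours agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def directed_hausdorff(p, q):
    """Largest distance from any point in p to its nearest point in q."""
    return max(min(math.dist(x, y) for y in q) for x in p)


def hausdorff_distance(p, q):
    """Symmetric Hausdorff distance (in voxel units) between two point sets."""
    return max(directed_hausdorff(p, q), directed_hausdorff(q, p))


def flag_contour(cnn_voxels, macs_voxels, threshold):
    """Flag the MACS contour as suspect when its DSC with the CNN contour is below threshold."""
    return dice_coefficient(cnn_voxels, macs_voxels) < threshold


# Toy 2D example: two 2x2 squares that overlap in one column.
cnn = {(0, 0), (0, 1), (1, 0), (1, 1)}
macs = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice_coefficient(cnn, macs))     # 0.5
print(hausdorff_distance(cnn, macs))   # 1.0
print(flag_contour(cnn, macs, 0.8))    # True (0.8 is an illustrative threshold)
```

This brute-force Hausdorff computation is quadratic in the number of points, so it is only suitable for small examples; distances here are in voxel units and would need scaling by the voxel spacing to be reported in cm.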

Results: The average DSC and Hausdorff distance of our CNN-based tool were 98.4%/1.23 cm for brain, 89.1%/0.42 cm for eyes, 86.8%/1.28 cm for mandible, 86.4%/0.88 cm for brainstem, 83.4%/0.71 cm for spinal cord, 82.7%/1.37 cm for parotids, 80.7%/1.08 cm for esophagus, 71.7%/0.39 cm for lenses, 68.6%/0.72 cm for optic nerves, 66.4%/0.46 cm for cochleas, and 40.7%/0.96 cm for optic chiasm. With the error detection tool, the proportions of clinically unacceptable MACS contours that were correctly detected were 0.99/0.80 on average, excluding the optic chiasm, when contours with minor errors were considered clinically acceptable/unacceptable, respectively. The proportions of clinically acceptable MACS contours that were correctly detected were 0.81/0.60 on average, excluding the optic chiasm, when contours with minor errors were considered clinically acceptable/unacceptable, respectively.
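
The two reported proportions correspond to how often unacceptable contours are flagged and how often acceptable contours are passed, evaluated under each scenario for minor errors. The sketch below shows one way to compute them from DSC values and error grades. The data, the grade labels ("none", "minor", "major"), the function names, and the single 0.8 threshold are all illustrative assumptions; the study derived separate thresholds per structure and per scenario.

```python
def unacceptable_labels(error_grades, minor_is_acceptable):
    """Map error grades to True (clinically unacceptable) or False under one scenario."""
    bad = {"major"} if minor_is_acceptable else {"minor", "major"}
    return [grade in bad for grade in error_grades]


def detection_rates(dsc_values, is_unacceptable, threshold):
    """Return (fraction of unacceptable contours flagged,
               fraction of acceptable contours not flagged)."""
    flagged = [d < threshold for d in dsc_values]
    n_bad = sum(is_unacceptable)
    n_good = len(is_unacceptable) - n_bad
    true_pos = sum(f and u for f, u in zip(flagged, is_unacceptable))
    true_neg = sum((not f) and (not u) for f, u in zip(flagged, is_unacceptable))
    sensitivity = true_pos / n_bad if n_bad else float("nan")
    specificity = true_neg / n_good if n_good else float("nan")
    return sensitivity, specificity


# Made-up DSC values between CNN and MACS contours, with made-up error grades.
dsc = [0.95, 0.90, 0.78, 0.60, 0.85, 0.40]
grades = ["none", "none", "minor", "major", "minor", "major"]

# Scenario (a): minor errors are clinically acceptable.
print(detection_rates(dsc, unacceptable_labels(grades, True), 0.8))   # (1.0, 0.75)
# Scenario (b): minor errors are clinically unacceptable.
print(detection_rates(dsc, unacceptable_labels(grades, False), 0.8))  # (0.75, 1.0)
```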

Conclusion: Our CNN-based autocontouring tool performed well on both the publicly available and the internal datasets. Furthermore, our results show that CNN-based algorithms are able to identify ill-defined contours from a clinically validated and used multiatlas-based autocontouring tool. Therefore, our CNN-based tool can effectively perform automatic verification of MACS contours.

Keywords: autocontouring; contouring QA; convolutional neural network; deep learning; head and neck.

Conflict of interest statement

This work was partially funded by the National Cancer Institution and Varian.

Figures

Figure 1
(a) The semiautomated method for data curation. When the Dice similarity coefficient was lower than required by the criteria, the original contour was reviewed manually. (b) Example of a structure labeled incorrectly (the right eye labeled as left eye). (c) Error in which the teeth were included as part of the mandible. [Color figure can be viewed at http://wileyonlinelibrary.com]
Figure 2
Application of the convolutional neural network‐based classification model to a computed tomography (CT) scan. The presence or absence of mandible on each CT slice was evaluated as shown in (a), and once the evaluation was done for every structure on all CT slices, the range of each structure in the longitudinal axis was selected for each patient as shown in (b). [Color figure can be viewed at http://wileyonlinelibrary.com]
Figure 3
Receiver operating characteristic curves generated for each structure with 40 patients. The 95% confidence intervals (CI) for the areas under the curves were derived with the bootstrapping method. Dice similarity coefficient thresholds were derived and presented for both scenarios: minor contouring errors acceptable and minor contouring errors unacceptable. [Color figure can be viewed at http://wileyonlinelibrary.com]
Figure 4
Multiatlas‐based autocontouring system mandible contours with errors were tested with the error detection system. Two consecutive slices around the most erroneous region were presented for each case, and the results were (a) major error detected, (b) minor error detected, and (c) minor error not detected. [Color figure can be viewed at http://wileyonlinelibrary.com]
Figure 5
Physician‐drawn contours for cochleas (a), (b) and brain (c), (d). (a) and (c) were drawn by physicians at MD Anderson, and (b) and (d) were drawn by physicians working with DeepMind. [Color figure can be viewed at http://wileyonlinelibrary.com]
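
The Figure 3 caption mentions bootstrapped 95% confidence intervals for the area under each ROC curve. A minimal sketch of one common approach, the percentile bootstrap, is shown below. The abstract does not say which bootstrap variant was used, so this is an assumption, as are the function names, the resample count, and the use of 1 - DSC as the error score.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive case scores higher than a
    negative case (ties count as half), i.e. the Mann-Whitney statistic."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        lab = [labels[i] for i in idx]
        if all(lab) or not any(lab):
            continue  # AUC is undefined unless both classes are present
        stats.append(auc([scores[i] for i in idx], lab))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Made-up example: a low DSC suggests an error, so 1 - DSC serves as the score.
dsc = [0.95, 0.90, 0.78, 0.60, 0.85, 0.40, 0.88, 0.70]
has_error = [False, False, True, True, False, True, True, False]
scores = [1 - d for d in dsc]

print(auc(scores, has_error))
print(bootstrap_auc_ci(scores, has_error, n_boot=500))
```

With only a handful of cases the interval is wide and unstable; the example is meant to show the mechanics, not to reproduce the reported curves.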
