J Radiat Res. 2021 Jan 1;62(1):94-103. doi: 10.1093/jrr/rraa094.

A slice classification model-facilitated 3D encoder-decoder network for segmenting organs at risk in head and neck cancer

Shuming Zhang et al. J Radiat Res. 2021.

Abstract

For deep learning networks used to segment organs at risk (OARs) in head and neck (H&N) cancers, the class imbalance between small-volume OARs and the whole computed tomography (CT) image leads to delineations with serious false positives on irrelevant slices and to unnecessary, time-consuming computation. To alleviate this problem, a slice classification model-facilitated 3D encoder-decoder network was developed and validated. In the developed two-step segmentation model, a slice classification model first classified CT slices into six categories in the craniocaudal direction; the slices belonging to the target categories of each OAR were then routed to separate 3D encoder-decoder segmentation networks. All patients were divided into training (n = 120), validation (n = 30) and testing (n = 20) datasets. The average accuracy of the slice classification model was 95.99%. The Dice similarity coefficient and 95% Hausdorff distance for each OAR were, respectively: right eye (0.88 ± 0.03 and 1.57 ± 0.92 mm), left eye (0.89 ± 0.03 and 1.35 ± 0.43 mm), right optic nerve (0.72 ± 0.09 and 1.79 ± 1.01 mm), left optic nerve (0.73 ± 0.09 and 1.60 ± 0.71 mm), brainstem (0.87 ± 0.04 and 2.28 ± 0.99 mm), right temporal lobe (0.81 ± 0.12 and 3.28 ± 2.27 mm), left temporal lobe (0.82 ± 0.09 and 3.73 ± 2.08 mm), right temporomandibular joint (0.70 ± 0.13 and 1.79 ± 0.79 mm), left temporomandibular joint (0.70 ± 0.16 and 1.98 ± 1.48 mm), mandible (0.89 ± 0.02 and 1.66 ± 0.51 mm), right parotid (0.77 ± 0.07 and 7.30 ± 4.19 mm) and left parotid (0.71 ± 0.12 and 8.41 ± 4.84 mm). The total segmentation time was 40.13 s. The 3D encoder-decoder network facilitated by the slice classification model achieved superior accuracy and efficiency in segmenting OARs in H&N CT images, which may significantly reduce the workload of radiation oncologists.
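
To make the two-step workflow concrete, the sketch below shows how a slice classifier can gate which CT slices reach each OAR-specific 3D network. This is a minimal illustration under stated assumptions, not the authors' code: the names (classify_slice, segmenters, TARGET_CATEGORIES) and the category-to-OAR routing table are hypothetical.

    import numpy as np

    # Hypothetical category-to-OAR routing table. The paper defines six
    # craniocaudal categories, but this exact mapping is an assumption.
    TARGET_CATEGORIES = {
        "eyes_and_optic_nerves": {3},
        "brainstem_and_temporal_lobes": {3, 4},
        "mandible_and_parotids": {5, 6},
    }

    def run_two_step_segmentation(ct_volume, classify_slice, segmenters):
        """Classify each axial slice, then send only the relevant slab of
        the CT volume to each OAR-specific 3D encoder-decoder network."""
        # Step 1: per-slice category (1..6) in the craniocaudal direction.
        categories = np.array([classify_slice(s) for s in ct_volume])

        masks = {}
        for oar_group, wanted in TARGET_CATEGORIES.items():
            idx = np.flatnonzero(np.isin(categories, list(wanted)))
            if idx.size == 0:
                continue  # no relevant slices: skip this 3D network entirely
            # Step 2: segment only the contiguous slab of target slices.
            slab = ct_volume[idx.min() : idx.max() + 1]
            masks[oar_group] = (int(idx.min()), segmenters[oar_group](slab))
        return masks  # slab offset + mask, to map back to the full volume

Because irrelevant slices never reach the 3D networks, both the false positives on those slices and the wasted computation described above are avoided by construction.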

Keywords: automatic segmentation; deep learning; head and neck; organs at risk; radiotherapy.
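
The abstract reports two geometric metrics per OAR: the Dice similarity coefficient (DSC, volume overlap) and the 95% Hausdorff distance (HD95, surface agreement in mm). A compact, illustrative way to compute both from binary 3D masks is sketched below using SciPy's distance transform; this is an assumed reference implementation, not the evaluation code used in the study.

    import numpy as np
    from scipy.ndimage import binary_erosion, distance_transform_edt

    def dice(pred, gt):
        """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        denom = pred.sum() + gt.sum()
        return 2.0 * (pred & gt).sum() / denom if denom else 1.0

    def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
        """95th-percentile symmetric surface distance (HD95) in mm."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        pred_surf = pred & ~binary_erosion(pred)  # boundary voxels only
        gt_surf = gt & ~binary_erosion(gt)
        # Distance from every voxel to the nearest surface voxel of the
        # other mask, in physical units given by the voxel spacing.
        d_to_gt = distance_transform_edt(~gt_surf, sampling=spacing)
        d_to_pred = distance_transform_edt(~pred_surf, sampling=spacing)
        dists = np.concatenate([d_to_gt[pred_surf], d_to_pred[gt_surf]])
        return np.percentile(dists, 95)

HD95 takes the 95th percentile of the symmetric surface distances rather than the maximum, which makes it less sensitive to single outlier voxels than the plain Hausdorff distance.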

Figures

Fig. 1.
Illustration of the categories in the slice classification model. The boundary between categories I and II was the first slice of the skull; the boundary between categories II and III was the first slice of the eyes; the boundary between categories III and IV was the last slice of the eyes; the boundary between categories IV and V was the last slice of the cerebellum; and the boundary between categories V and VI was the last slice of the mandible.
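
As an illustration of how the five anatomical boundaries in Fig. 1 induce the six categories, the snippet below converts boundary slice indices into a per-slice label array. The specific indices, and the convention that the slice index increases craniocaudally, are assumptions for demonstration.

    import numpy as np

    def label_slices(n_slices, boundaries):
        """Map axial slice indices to categories I-VI (returned as 1..6),
        assuming the slice index increases craniocaudally.

        `boundaries` holds the five boundary slice indices of Fig. 1. For
        the "first slice of X" boundaries pass the index itself; for the
        "last slice of X" boundaries pass index + 1, so that the named
        slice stays inside its own category.
        """
        b = np.sort(np.asarray(boundaries))
        return np.searchsorted(b, np.arange(n_slices), side="right") + 1

    # Hypothetical indices for a 120-slice scan: first skull slice, first
    # eye slice, (last eye slice)+1, (last cerebellum slice)+1,
    # (last mandible slice)+1.
    labels = label_slices(120, [10, 32, 46, 71, 96])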
Fig. 2.
The architecture of the slice classification model. It mainly consisted of separable convolution, max-pooling and global average pooling modules. Concatenations between pairs of separable convolutions merged the earlier and later feature maps so that more features were preserved.
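
A minimal PyTorch sketch of one plausible building block with these ingredients follows. The channel widths, the number of blocks and the exact concatenation points are assumptions; the paper's figure, not its text, fixes them.

    import torch
    import torch.nn as nn

    class SeparableConv2d(nn.Module):
        """Depthwise separable convolution: per-channel spatial conv
        followed by a 1x1 pointwise conv."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    class ClassifierBlock(nn.Module):
        """Two separable convolutions whose outputs are concatenated
        (earlier and later feature maps merged), then max-pooled."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv1 = SeparableConv2d(in_ch, out_ch)
            self.conv2 = SeparableConv2d(out_ch, out_ch)
            self.pool = nn.MaxPool2d(2)

        def forward(self, x):
            front = torch.relu(self.conv1(x))
            rear = torch.relu(self.conv2(front))
            return self.pool(torch.cat([front, rear], dim=1))

    class SliceClassifier(nn.Module):
        """CT slice -> one of six craniocaudal categories, ending in
        global average pooling as described in the caption."""
        def __init__(self, n_classes=6):
            super().__init__()
            self.blocks = nn.Sequential(
                ClassifierBlock(1, 16),   # channel widths are assumptions;
                ClassifierBlock(32, 32),  # in_ch doubles after each concat
                ClassifierBlock(64, 64),
            )
            self.head = nn.Linear(128, n_classes)

        def forward(self, x):          # x: (B, 1, H, W) CT slice
            f = self.blocks(x)
            f = f.mean(dim=(2, 3))     # global average pooling
            return self.head(f)

Depthwise separable convolutions keep the classifier light, which matters because it runs on every slice before any 3D network is invoked.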
Fig. 3.
The architecture of the refined 3D encoder–decoder network. (A) The refined 3D encoder–decoder network was constructed with a down-sampling path and an up-sampling path in which the dilated convolution stacks were used to extract image features. (B) The dilated convolution stack consisted of four convolution modules with different dilation rates (rate = 1, 2, 3, 4).
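
Panel B specifies four convolution modules with dilation rates 1-4. A minimal PyTorch sketch of such a stack follows; treating the four modules as parallel branches fused by summation is an assumption, as are the normalization and activation choices.

    import torch.nn as nn

    class DilatedConvStack(nn.Module):
        """Four 3D convolution modules with dilation rates 1-4, as in
        Fig. 3B. They are modeled here as parallel branches whose outputs
        are summed; the exact fusion is fixed by the figure, not the text."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(
                    # padding == dilation keeps the spatial size unchanged
                    nn.Conv3d(in_ch, out_ch, 3, padding=r, dilation=r),
                    nn.BatchNorm3d(out_ch),
                    nn.ReLU(inplace=True),
                )
                for r in (1, 2, 3, 4)
            ])

        def forward(self, x):
            return sum(branch(x) for branch in self.branches)

Stacking dilation rates 1-4 enlarges the receptive field without extra down-sampling, which helps the encoder-decoder capture context around small OARs.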
Fig. 4.
The number of slices categorized incorrectly for each category at the boundary for each testing patient. The positive value (+n) indicated that there were n redundant slices in this category at the boundary, and the negative value (−m) indicated that there were m missed slices in this category at the boundary.
Fig. 5.
The average number of CT slices pushed/not pushed to the segmentation networks of the two-step segmentation model for the testing data. The white area (first number) indicated the number of CT slices that were not pushed to the segmentation networks; the gray area (second number) indicated the number of CT slices that were pushed to them.
Fig. 6.
Visual comparison of auto segmentation and ground truth of the brainstem, eyes, optic nerves, temporal lobes, mandible and parotids. The blue and green lines denote the delineation results generated by the two-step segmentation model and segmentation-only model, respectively. The purple lines indicate the ground truth labeled by experienced radiation oncologists.
