J Radiat Res. 2021 Jan 1;62(1):94-103. doi: 10.1093/jrr/rraa094.

A slice classification model-facilitated 3D encoder-decoder network for segmenting organs at risk in head and neck cancer

Shuming Zhang et al. J Radiat Res. 2021.

Abstract

For deep learning networks used to segment organs at risk (OARs) in head and neck (H&N) cancers, the class imbalance between small-volume OARs and the whole computed tomography (CT) image leads to delineations with serious false positives on irrelevant slices and to unnecessary, time-consuming computation. To alleviate this problem, a slice classification model-facilitated 3D encoder-decoder network was developed and validated. In the developed two-step segmentation model, a slice classification model first classified CT slices into six categories in the craniocaudal direction; the slices belonging to the target categories of each OAR were then routed to separate 3D encoder-decoder segmentation networks. All patients were divided into training (n = 120), validation (n = 30) and testing (n = 20) datasets. The average accuracy of the slice classification model was 95.99%. The Dice similarity coefficient and 95% Hausdorff distance for each OAR were, respectively: right eye (0.88 ± 0.03 and 1.57 ± 0.92 mm), left eye (0.89 ± 0.03 and 1.35 ± 0.43 mm), right optic nerve (0.72 ± 0.09 and 1.79 ± 1.01 mm), left optic nerve (0.73 ± 0.09 and 1.60 ± 0.71 mm), brainstem (0.87 ± 0.04 and 2.28 ± 0.99 mm), right temporal lobe (0.81 ± 0.12 and 3.28 ± 2.27 mm), left temporal lobe (0.82 ± 0.09 and 3.73 ± 2.08 mm), right temporomandibular joint (0.70 ± 0.13 and 1.79 ± 0.79 mm), left temporomandibular joint (0.70 ± 0.16 and 1.98 ± 1.48 mm), mandible (0.89 ± 0.02 and 1.66 ± 0.51 mm), right parotid (0.77 ± 0.07 and 7.30 ± 4.19 mm) and left parotid (0.71 ± 0.12 and 8.41 ± 4.84 mm). The total segmentation time was 40.13 s. The 3D encoder-decoder network facilitated by the slice classification model achieved superior accuracy and efficiency in segmenting OARs in H&N CT images, which may significantly reduce the workload of radiation oncologists.
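
To make the two-step workflow concrete, the sketch below shows how a slice classifier can gate which CT slices reach each OAR-specific 3D network. This is a minimal illustration under stated assumptions, not the authors' code: the names (classify_slice, segmenters, TARGET_CATEGORIES) and the category-to-OAR routing table are hypothetical.

    import numpy as np

    # Hypothetical category-to-OAR routing table. The paper defines six
    # craniocaudal categories, but this exact mapping is an assumption.
    TARGET_CATEGORIES = {
        "eyes_and_optic_nerves": {3},
        "brainstem_and_temporal_lobes": {3, 4},
        "mandible_and_parotids": {5, 6},
    }

    def run_two_step_segmentation(ct_volume, classify_slice, segmenters):
        """Classify each axial slice, then send only the relevant slab of
        the CT volume to each OAR-specific 3D encoder-decoder network."""
        # Step 1: per-slice category (1..6) in the craniocaudal direction.
        categories = np.array([classify_slice(s) for s in ct_volume])

        masks = {}
        for oar_group, wanted in TARGET_CATEGORIES.items():
            idx = np.flatnonzero(np.isin(categories, list(wanted)))
            if idx.size == 0:
                continue  # no relevant slices: skip this 3D network entirely
            # Step 2: segment only the contiguous slab of target slices.
            slab = ct_volume[idx.min() : idx.max() + 1]
            masks[oar_group] = (int(idx.min()), segmenters[oar_group](slab))
        return masks  # slab offset + mask, to map back to the full volume

Because irrelevant slices never reach the 3D networks, both the false positives on those slices and the wasted computation described above are avoided by construction.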

Keywords: automatic segmentation; deep learning; head and neck; organs at risk; radiotherapy.
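
The abstract reports two geometric metrics per OAR: the Dice similarity coefficient (DSC, volume overlap) and the 95% Hausdorff distance (HD95, surface agreement in mm). A compact, illustrative way to compute both from binary 3D masks is sketched below using SciPy's distance transform; this is an assumed reference implementation, not the evaluation code used in the study.

    import numpy as np
    from scipy.ndimage import binary_erosion, distance_transform_edt

    def dice(pred, gt):
        """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        denom = pred.sum() + gt.sum()
        return 2.0 * (pred & gt).sum() / denom if denom else 1.0

    def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
        """95th-percentile symmetric surface distance (HD95) in mm."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        pred_surf = pred & ~binary_erosion(pred)  # boundary voxels only
        gt_surf = gt & ~binary_erosion(gt)
        # Distance from every voxel to the nearest surface voxel of the
        # other mask, in physical units given by the voxel spacing.
        d_to_gt = distance_transform_edt(~gt_surf, sampling=spacing)
        d_to_pred = distance_transform_edt(~pred_surf, sampling=spacing)
        dists = np.concatenate([d_to_gt[pred_surf], d_to_pred[gt_surf]])
        return np.percentile(dists, 95)

HD95 takes the 95th percentile of the symmetric surface distances rather than the maximum, which makes it less sensitive to single outlier voxels than the plain Hausdorff distance.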

Figures

Fig. 1.
Illustration of the categories in the slice classification model. The boundary between categories I and II was the first slice of the skull; the boundary between categories II and III was the first slice of the eyes; the boundary between categories III and IV was the last slice of the eyes; the boundary between categories IV and V was the last slice of the cerebellum; and the boundary between categories V and VI was the last slice of the mandible.
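
As an illustration of how the five anatomical boundaries in Fig. 1 induce the six categories, the snippet below converts boundary slice indices into a per-slice label array. The specific indices, and the convention that the slice index increases craniocaudally, are assumptions for demonstration.

    import numpy as np

    def label_slices(n_slices, boundaries):
        """Map axial slice indices to categories I-VI (returned as 1..6),
        assuming the slice index increases craniocaudally.

        `boundaries` holds the five boundary slice indices of Fig. 1. For
        the "first slice of X" boundaries pass the index itself; for the
        "last slice of X" boundaries pass index + 1, so that the named
        slice stays inside its own category.
        """
        b = np.sort(np.asarray(boundaries))
        return np.searchsorted(b, np.arange(n_slices), side="right") + 1

    # Hypothetical indices for a 120-slice scan: first skull slice, first
    # eye slice, (last eye slice)+1, (last cerebellum slice)+1,
    # (last mandible slice)+1.
    labels = label_slices(120, [10, 32, 46, 71, 96])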
Fig. 2.
The architecture of the slice classification model. It mainly consisted of separable convolution, max-pooling and global average pooling modules. Concatenations between pairs of separable convolutions merged the earlier and later feature maps so that more features were preserved.
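
A minimal PyTorch sketch of one plausible building block with these ingredients follows. The channel widths, the number of blocks and the exact concatenation points are assumptions; the paper's figure, not its text, fixes them.

    import torch
    import torch.nn as nn

    class SeparableConv2d(nn.Module):
        """Depthwise separable convolution: per-channel spatial conv
        followed by a 1x1 pointwise conv."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    class ClassifierBlock(nn.Module):
        """Two separable convolutions whose outputs are concatenated
        (earlier and later feature maps merged), then max-pooled."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv1 = SeparableConv2d(in_ch, out_ch)
            self.conv2 = SeparableConv2d(out_ch, out_ch)
            self.pool = nn.MaxPool2d(2)

        def forward(self, x):
            front = torch.relu(self.conv1(x))
            rear = torch.relu(self.conv2(front))
            return self.pool(torch.cat([front, rear], dim=1))

    class SliceClassifier(nn.Module):
        """CT slice -> one of six craniocaudal categories, ending in
        global average pooling as described in the caption."""
        def __init__(self, n_classes=6):
            super().__init__()
            self.blocks = nn.Sequential(
                ClassifierBlock(1, 16),   # channel widths are assumptions;
                ClassifierBlock(32, 32),  # in_ch doubles after each concat
                ClassifierBlock(64, 64),
            )
            self.head = nn.Linear(128, n_classes)

        def forward(self, x):          # x: (B, 1, H, W) CT slice
            f = self.blocks(x)
            f = f.mean(dim=(2, 3))     # global average pooling
            return self.head(f)

Depthwise separable convolutions keep the classifier light, which matters because it runs on every slice before any 3D network is invoked.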
Fig. 3.
The architecture of the refined 3D encoder–decoder network. (A) The refined 3D encoder–decoder network was constructed with a down-sampling path and an up-sampling path in which the dilated convolution stacks were used to extract image features. (B) The dilated convolution stack consisted of four convolution modules with different dilation rates (rate = 1, 2, 3, 4).
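
Panel B specifies four convolution modules with dilation rates 1-4. A minimal PyTorch sketch of such a stack follows; treating the four modules as parallel branches fused by summation is an assumption, as are the normalization and activation choices.

    import torch.nn as nn

    class DilatedConvStack(nn.Module):
        """Four 3D convolution modules with dilation rates 1-4, as in
        Fig. 3B. They are modeled here as parallel branches whose outputs
        are summed; the exact fusion is fixed by the figure, not the text."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(
                    # padding == dilation keeps the spatial size unchanged
                    nn.Conv3d(in_ch, out_ch, 3, padding=r, dilation=r),
                    nn.BatchNorm3d(out_ch),
                    nn.ReLU(inplace=True),
                )
                for r in (1, 2, 3, 4)
            ])

        def forward(self, x):
            return sum(branch(x) for branch in self.branches)

Stacking dilation rates 1-4 enlarges the receptive field without extra down-sampling, which helps the encoder-decoder capture context around small OARs.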
Fig. 4.
The number of slices categorized incorrectly for each category at the boundary for each testing patient. The positive value (+n) indicated that there were n redundant slices in this category at the boundary, and the negative value (−m) indicated that there were m missed slices in this category at the boundary.
Fig. 5.
The average number of CT slices pushed/not pushed to the segmentation networks of the two-step segmentation model for the testing data. The white area (first number) indicated the number of CT slices that were not pushed to the segmentation networks; the gray area (second number) indicated the number of CT slices that were pushed to them.
Fig. 6.
Visual comparison of auto segmentation and ground truth of the brainstem, eyes, optic nerves, temporal lobes, mandible and parotids. The blue and green lines denote the delineation results generated by the two-step segmentation model and segmentation-only model, respectively. The purple lines indicate the ground truth labeled by experienced radiation oncologists.
