Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer

Jinhan Zhu¹, Yimei Liu¹, Jun Zhang¹, Yixuan Wang¹, Lixin Chen¹

Affiliations

Affiliation

¹ State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China.

PMID: 31334129
PMCID: PMC6624788
DOI: 10.3389/fonc.2019.00627

Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer

Jinhan Zhu et al. Front Oncol. 2019.

. 2019 Jul 5:9:627.

doi: 10.3389/fonc.2019.00627. eCollection 2019.

Authors

Jinhan Zhu¹, Yimei Liu¹, Jun Zhang¹, Yixuan Wang¹, Lixin Chen¹

Affiliation

¹ State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China.

PMID: 31334129
PMCID: PMC6624788
DOI: 10.3389/fonc.2019.00627

Abstract

Background: In this study, publicly datasets with organs at risk (OAR) structures were used as reference data to compare the differences of several observers. Convolutional neural network (CNN)-based auto-contouring was also used in the analysis. We evaluated the variations among observers and the effect of CNN-based auto-contouring in clinical applications. Materials and methods: A total of 60 publicly available lung cancer CT with structures were used; 48 cases were used for training, and the other 12 cases were used for testing. The structures of the datasets were used as reference data. Three observers and a CNN-based program performed contouring for 12 testing cases, and the 3D dice similarity coefficient (DSC) and mean surface distance (MSD) were used to evaluate differences from the reference data. The three observers edited the CNN-based contours, and the results were compared to those of manual contouring. A value of P<0.05 was considered statistically significant. Results: Compared to the reference data, no statistically significant differences were observed for the DSCs and MSDs among the manual contouring performed by the three observers at the same institution for the heart, esophagus, spinal cord, and left and right lungs. The 95% confidence interval (CI) and P-values of the CNN-based auto-contouring results comparing to the manual results for the heart, esophagus, spinal cord, and left and right lungs were as follows: the DSCs were CNN vs. A: 0.914~0.939(P = 0.004), 0.746~0.808(P = 0.002), 0.866~0.887(P = 0.136), 0.952~0.966(P = 0.158) and 0.960~0.972 (P = 0.136); CNN vs. B: 0.913~0.936 (P = 0.002), 0.745~0.807 (P = 0.005), 0.864~0.894 (P = 0.239), 0.952~0.964 (P = 0.308), and 0.959~0.971 (P = 0.272); and CNN vs. C: 0.912~0.933 (P = 0.004), 0.748~0.804(P = 0.002), 0.867~0.890 (P = 0.530), 0.952~0.964 (P = 0.308), and 0.958~0.970 (P = 0.480), respectively. The P-values of MSDs are similar to DSCs. The P-values of heart and esophagus is smaller than 0.05. No significant differences were found between the edited CNN-based auto-contouring results and the manual results. Conclusion: For the spinal cord, both lungs, no statistically significant differences were found between CNN-based auto-contouring and manual contouring. Further modifications to contouring of the heart and esophagus are necessary. Overall, editing based on CNN-based auto-contouring can effectively shorten the contouring time without affecting the results. CNNs have considerable potential for automatic contouring applications.

Keywords: auto-contouring; contour variation; deep convolutional neural network; lung cancer; organs at risk.

PubMed Disclaimer

References

1. Tao CJ, Yi JL, Chen NY, Ren W, Cheng J, Tung S, et al. Multi-subject atlas-based auto-segmentation reduces interobserver variation and improves dosimetric parameter consistency for organs at risk in nasopharyngeal carcinoma: A multi-institution clinical study. Radiother Oncol. (2015) 115:407–11. 10.1016/j.radonc.2015.05.012 - DOI - PubMed
1. Nelms BE, Tomé WA, Robinson G, Wheeler J. Variations in the contouring of organs at risk: test case from a patient with oropharyngeal cancer. Int J Radiation Oncol Biol Phys. (2012) 82:368–78. 10.1016/j.ijrobp.2010.10.019 - DOI - PubMed
1. Lin L, Dou Q, Jin YM, Zhou GQ, Tang YQ, Chen WL, et al. Deep learning for automated contouring of primary tumor volumes by MRI for nasopharyngeal carcinoma. Radiology. (2019) 291:677–86. 10.1148/radiol.2019182012 - DOI - PubMed
1. Chen AM, Chin R, Beron P, Yoshizaki T, Mikaeilian AG, Cao M. Inadequate target volume delineation and local-regional recurrence after intensity-modulated radiotherapy for human papillomavirus-positive oropharynx cancer. Radiother Oncol. (2017) 123:412–8. 10.1016/j.radonc.2017.04.015 - DOI - PubMed
1. Rosewall T, Bayley AJ, Chung P, Le LW, Xie J, Baxi S, et al. The effect of delineation method and observer variability on bladder dose-volume histograms for prostate intensity modulated radiotherapy. Radiother Oncol. (2011) 101:479–85. 10.1016/j.radonc.2011.06.039 - DOI - PubMed

LinkOut - more resources

Full Text Sources

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer

Affiliation

Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer

Authors

Affiliation

Abstract

References

LinkOut - more resources

Full Text Sources