Comparative Study

Phys Med Biol. 2011 Jul 21;56(14):4557-77.

doi: 10.1088/0031-9155/56/14/021. Epub 2011 Jul 1.

Comparison of manual and automatic segmentation methods for brain structures in the presence of space-occupying lesions: a multi-expert study

M A Deeley¹, A Chen, R Datteri, J H Noble, A J Cmelak, E F Donnelly, A W Malcolm, L Moretti, J Jaboin, K Niermann, Eddy S Yang, David S Yu, F Yei, T Koyama, G X Ding, B M Dawant

PMID: 21725140
PMCID: PMC3153124
DOI: 10.1088/0031-9155/56/14/021

M A Deeley et al. Phys Med Biol. 2011.

Affiliation

¹ Department of Radiation Oncology, Vanderbilt University, Nashville, TN, USA. matthew.deeley@uvm.edu

Abstract

The purpose of this work was to characterize expert variation in segmentation of intracranial structures pertinent to radiation therapy, and to assess a registration-driven atlas-based segmentation algorithm in that context. Eight experts were recruited to segment the brainstem, optic chiasm, optic nerves, and eyes of 20 patients who underwent therapy for large space-occupying tumors. Performance variability was assessed through three geometric measures: volume, Dice similarity coefficient (DSC), and Euclidean distance. In addition, two simulated ground truth segmentations were calculated via the simultaneous truth and performance level estimation (STAPLE) algorithm and a novel application of probability maps. The experts and automatic system were found to generate structures of similar volume, though the experts exhibited higher variation with respect to tubular structures. No difference was found between the mean DSC of the automatic and expert delineations as a group at a 5% significance level over all cases and organs. The larger structures of the brainstem and eyes exhibited mean DSC of approximately 0.8-0.9, whereas the tubular chiasm and nerves were lower, approximately 0.4-0.5. Similarly low DSCs have been reported previously without the context of several experts and patient volumes. This study, however, provides evidence that experts are similarly challenged. The average maximum distances (maximum inside, maximum outside) from a simulated ground truth ranged from (-4.3, +5.4) mm for the automatic system to (-3.9, +7.5) mm for the experts considered as a group. When raters were ranked by true positive rate at a 2 mm threshold from the simulated ground truth over all structures, the automatic system ranked second of the nine raters. This work underscores the need for large-scale studies utilizing statistically robust numbers of patients and experts in evaluating the quality of automatic algorithms.
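Two of the geometric measures named in the abstract, volume and the Dice similarity coefficient, are easy to state in code. The following is a minimal sketch, not the authors' implementation: it assumes each segmentation is available as a flattened binary mask on a common voxel grid, and the function names and the toy masks are hypothetical.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """DSC = 2|A n B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxel grid")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


def volume_cm3(mask: Sequence[int], voxel_volume_mm3: float) -> float:
    """Structure volume in cm^3: voxel count times voxel volume (1 cm^3 = 1000 mm^3)."""
    return sum(1 for v in mask if v) * voxel_volume_mm3 / 1000.0


def segment(start: int, length: int, grid: int = 20) -> list:
    """Hypothetical 1-D 'structure': `length` consecutive voxels set to 1."""
    return [1 if start <= i < start + length else 0 for i in range(grid)]


# The same one-voxel shift applied to a large and to a thin structure.
dsc_large = dice_coefficient(segment(0, 10), segment(1, 10))
dsc_thin = dice_coefficient(segment(0, 2), segment(1, 2))
```

In this toy case the one-voxel shift costs the ten-voxel structure little overlap while halving the DSC of the two-voxel structure, which is one plausible reading of why thin tubular structures such as the chiasm and optic nerves score lower than the brainstem and eyes.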

Figures

**Figure 1**
Atlas-based segmentation process for the brainstem and eyes. Panel (a): Orthogonal slices of a patient (top row) with a large right sided lesion and the atlas (bottom row) before registration. Panel (b): Volumes are then globally, affinely registered, and a bounded atlas region (white box) is projected onto the patient. Panel (c): Local affine and local non-rigid registration are performed on the bounded region where the top row represents the final product of the patient brainstem deformed to the atlas.

**Figure 2**
A randomly chosen patient from the 20 cases used in this study. Eight physician raters segmented the brainstem, optic chiasm, eyes, and optic nerves using a fused CT/MR image set. The automatically generated segmentations are shown in purple. The large red contour in the right parietal is the gross tumor volume.

**Figure 3**
Axial slice showing an area of high physician variability within the brainstem. In this area of the cerebellar peduncles there is little anatomical contrast, such that the physicians rely primarily on implicit knowledge. The automatic contour is represented in purple.

**Figure 4**
Volume [cm³] for the automatic (A₁), senior physician (P₁–P₄), junior physician (J₁–J₄), and simulated ground truth, STAPLE (S) and PMAP_mean (P) segmentations. The horizontal line through each box indicates the median of the volume distribution while the rectangular box represents the interquartile range. Small dots are outliers for the distribution.

**Figure 5**
Dice similarity coefficients across the 20 patients per structure to assess inter-rater performance and variance. Columns P₁–J₄ plot inter-physician comparisons: P₁–P₄ senior and J₁–J₄ junior physicians. Each distribution in these columns comprises pair-wise comparisons of the expert in question to each of the other experts. The automatic segmentations are included only in the first column.

**Figure 6**
Dice similarity coefficients for each rater group with respect to the simulated ground truths. The first two columns from the left compare A₁ to STAPLE (S) and PMAP_mean (P), followed by comparison with the physician group, followed by comparison between S and P in the far right column.

**Figure 7**
Distance (mm) distributions from rater segmentations to PMAP_mean. Positive distances indicate a contour point lying outside the ground truth segmentation while negative distances indicate a contour point lying within the ground truth.
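The sign convention in this caption can be illustrated with a short sketch. This is an assumption-laden toy, not the study's distance computation: the ground truth is a hypothetical 10 mm disc, its boundary is sampled at one-degree steps, and the nearest boundary sample is found by brute force.

```python
import math


def signed_distances(contour_pts, truth_boundary_pts, inside_truth):
    """Distance from each contour point to the nearest ground-truth boundary
    sample; negative if the point lies inside the ground truth, positive if outside."""
    result = []
    for p in contour_pts:
        d = min(math.dist(p, q) for q in truth_boundary_pts)
        result.append(-d if inside_truth(p) else d)
    return result


# Hypothetical ground truth: a disc of radius 10 mm centred at the origin.
RADIUS_MM = 10.0
truth_boundary = [
    (RADIUS_MM * math.cos(math.radians(k)), RADIUS_MM * math.sin(math.radians(k)))
    for k in range(360)
]


def inside_disc(p):
    return math.hypot(p[0], p[1]) < RADIUS_MM


# Hypothetical rater contour points (mm).
rater_contour = [(12.0, 0.0), (7.0, 0.0), (0.0, 10.5)]
distances = signed_distances(rater_contour, truth_boundary, inside_disc)

# "Maximum inside" and "maximum outside" are the extremes of the signed distribution.
max_inside = min(distances)
max_outside = max(distances)
```

With real data the boundary samples would come from the simulated ground truth contour, and a distance transform would normally replace the brute-force search.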

**Figure 8**
True positive rate of contour points drawn within a 2 mm shell around the simulated ground truth. The abscissa is partitioned by rater and structure, the ordinate is the 2 mm true positive rate, and the whiskers represent the 95% confidence interval on the proportion.
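As a rough sketch of the quantity plotted here, the 2 mm true positive rate can be read as the fraction of a rater's contour points whose distance to the simulated ground truth is at most 2 mm in magnitude. The caption does not say how the 95% confidence interval was computed; the normal-approximation (Wald) interval below is an assumption, and the function name and sample distances are hypothetical.

```python
import math


def tpr_within(distances_mm, threshold_mm=2.0, z=1.96):
    """Proportion of contour points within `threshold_mm` of the ground truth,
    with a normal-approximation (Wald) confidence interval (an assumption)."""
    n = len(distances_mm)
    if n == 0:
        raise ValueError("need at least one contour point")
    hits = sum(1 for d in distances_mm if abs(d) <= threshold_mm)
    p = hits / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical signed distances (mm) for one rater and one structure.
sample = [-1.0, 0.5, 1.9, 2.5, -3.0, 1.0, 0.0, 4.0]
rate, ci_low, ci_high = tpr_within(sample)
```

For small point counts or proportions near 0 or 1, a Wilson or exact interval would be a more defensible choice than the Wald interval used in this sketch.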
