IEEE Trans Med Imaging. 2010 Oct;29(10):1714-29.
doi: 10.1109/TMI.2010.2050897. Epub 2010 Jun 17.

A generative model for image segmentation based on label fusion

Mert R Sabuncu et al. IEEE Trans Med Imaging. 2010 Oct.

Abstract

We propose a nonparametric, probabilistic model for the automatic segmentation of medical images, given a training set of images and corresponding label maps. The resulting inference algorithms rely on pairwise registrations between the test image and individual training images. The training labels are then transferred to the test image and fused to compute the final segmentation of the test subject. Such label fusion methods have been shown to yield accurate segmentation, since the use of multiple registrations captures greater inter-subject anatomical variability and improves robustness against occasional registration failures. To the best of our knowledge, this manuscript presents the first comprehensive probabilistic framework that rigorously motivates label fusion as a segmentation approach. The proposed framework allows us to compare different label fusion algorithms theoretically and practically. In particular, recent label fusion or multiatlas segmentation algorithms are interpreted as special cases of our framework. We conduct two sets of experiments to validate the proposed methods. In the first set of experiments, we use 39 brain MRI scans (with manually segmented white matter, cerebral cortex, ventricles and subcortical structures) to compare different label fusion algorithms and the widely used FreeSurfer whole-brain segmentation tool. Our results indicate that the proposed framework yields more accurate segmentation than FreeSurfer and previous label fusion algorithms. In a second experiment, we use brain MRI scans of 282 subjects to demonstrate that the proposed segmentation tool is sufficiently sensitive to robustly detect hippocampal volume changes in a study of aging and Alzheimer's disease.
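The transfer-and-fuse step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the training images and label maps have already been registered and resampled onto the test image grid (stored here as flat lists of voxels), and it assumes a Gaussian weight on the intensity difference with a hypothetical `sigma` parameter for the locally weighted variant. The function names are illustrative only.

```python
import math
from collections import defaultdict


def majority_voting(warped_labels):
    """Fuse labels with one unweighted vote per training subject and voxel.

    warped_labels: list of label maps, one per training subject, each a flat
    list of integer labels already resampled onto the test image grid.
    """
    n_voxels = len(warped_labels[0])
    fused = []
    for x in range(n_voxels):
        votes = defaultdict(float)
        for labels in warped_labels:
            votes[labels[x]] += 1.0
        # Ties are broken in favour of the smallest label value.
        fused.append(max(sorted(votes), key=votes.get))
    return fused


def local_weighted_voting(test_image, warped_images, warped_labels, sigma=10.0):
    """Fuse labels with per-voxel weights based on intensity similarity.

    Assumption for this sketch: the weight of a training subject at a voxel is
    a Gaussian function of the intensity difference to the test image, with a
    hypothetical standard deviation `sigma`.
    """
    n_voxels = len(test_image)
    fused = []
    for x in range(n_voxels):
        votes = defaultdict(float)
        for image, labels in zip(warped_images, warped_labels):
            diff = test_image[x] - image[x]
            votes[labels[x]] += math.exp(-0.5 * (diff / sigma) ** 2)
        fused.append(max(sorted(votes), key=votes.get))
    return fused


# Toy data: 4 voxels, 3 registered training subjects (values are made up).
test_image = [100, 100, 50, 50]
warped_images = [[100, 98, 52, 90], [60, 99, 49, 88], [61, 55, 51, 52]]
warped_labels = [[1, 1, 0, 1], [0, 1, 0, 1], [0, 0, 0, 0]]

majority_result = majority_voting(warped_labels)
local_result = local_weighted_voting(test_image, warped_images, warped_labels)
print(majority_result)  # [0, 1, 0, 1]
print(local_result)     # [1, 1, 0, 0]
```

In the toy data the two rules disagree at the first and last voxel: the unweighted vote follows the majority of subjects, while the weighted vote follows the single subject whose intensity matches the test image there.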

Figures

Fig. 1
Graphical model that depicts the relationship between the variables. Squares indicate nonrandom parameters, circles indicate random variables. Replications are illustrated with plates (bounding L(x) and I(x)). The |Ω| in the corner of the plate indicates the variables inside are replicated that many times (i.e., once for each voxel), and thus are conditionally independent. Shaded variables are observed.
Fig. 2
A typical segmentation obtained with the local mixture model. 2D slices are shown for visualization only. All computations are done in 3D.
Fig. 3
Dice scores obtained using Majority Voting and various label prior models: Nearest Neighbor (red), Tri-linear (green), and LogOdds (blue). Dice scores for the two hemispheres were averaged. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles. The whiskers extend to 2.7 standard deviations around the mean, and outliers are marked individually as a “*.”
Fig. 4
Dice scores for all methods (top: left hemisphere, bottom: right hemisphere): FreeSurfer (red), Majority Voting (black), STAPLE (light green), Majority10 (dark green), Global Weighted Fusion (light blue), Local Weighted Voting (dark blue), Semi-local Weighted Fusion (purple). On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles. The whiskers extend to 2.7 standard deviations around the mean, and outliers are marked individually as a “*.”
Fig. 5
Average Dice scores for each algorithm (FS: FreeSurfer, Majority: Majority Voting, STAPLE, Majority10, Global: Global Weighted Fusion, Local: Local Weighted Voting, and Semi-Local: Semi-local Weighted Fusion). Error bars show standard error. Each subject and ROI was treated as an independent sample with equal weight.
Fig. 6
Average Dice differences: Semi-Local Weighted Fusion minus Local Weighted Voting. Overall, Semi-Local Weighted Fusion achieves better segmentation. Error bars show standard error.
Fig. 7
The segmentations of the subject that Semi-local Weighted Fusion performed the worst on. Left to right: FreeSurfer, Global and Semi-local Weighted Fusion. Common mistakes (indicated by arrows): (A) Global Weighted Fusion tends to over-segment complex shapes like the cortex. (B) Semi-local Weighted Fusion does not encode topological information, as FreeSurfer does. Hence it may assign an “unknown” or “background” label (white) in between the pallidum (blue), putamen (pink), and white matter (green).
Fig. 8
The average Dice score for Majority Voting (Majority) and Local Weighted Voting (Local) as a function of the number of training subjects. We consider two strategies to select the training subjects: (1) randomly selecting a set of training subjects (Rand), (2) selecting the best training subjects that are globally most similar to the test subject (Best). The average Dice score reaches 83.9% for Majority Voting and 87.8% for Local Weighted Voting, when all 38 subjects are used.
Fig. 9
Average Dice scores for different β values in the MRF membership prior of (8). Error bars show standard error.
Fig. 10
Hippocampal volume differences on the data from Experiment 1. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles. The whiskers extend to 2.7 standard deviations around the mean. (a) Automatic minus Manual volumes. (b) Relative volume differences, see (23).
Fig. 11
Age histogram of 282 subjects in Experiment 2.
Fig. 12
Hippocampal volumes for five different groups in Experiment 2. Error bars indicate standard error across subjects. Stars indicate that the volume measurements in the present group are statistically significantly smaller than the measurements in the neighboring group to the left. (Unpaired, single-sided t-test. * p < 0.05, ** p < 0.01). (a) Left hippocampus. (b) Right hippocampus.
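Several of the captions above report Dice scores. As a reminder of what that number measures, here is a minimal sketch of the standard Dice overlap between an automatic and a manual label map for one structure, 2|A∩B| / (|A| + |B|). The flat-list representation, the function name, and the convention of returning 1.0 when the label is absent from both maps are assumptions of this sketch, not details taken from the paper.

```python
def dice_score(auto_labels, manual_labels, label):
    """Dice overlap for one label between two label maps of equal length.

    Both maps are flat lists of integer labels on the same voxel grid.
    """
    in_auto = [v == label for v in auto_labels]
    in_manual = [v == label for v in manual_labels]
    overlap = sum(1 for a, m in zip(in_auto, in_manual) if a and m)
    total = sum(in_auto) + sum(in_manual)
    if total == 0:
        # Assumed convention: a label absent from both maps counts as perfect.
        return 1.0
    return 2.0 * overlap / total


# Toy label maps with three labels (values are made up).
auto = [1, 1, 0, 0, 2, 2]
manual = [1, 0, 0, 0, 2, 2]

for label in (0, 1, 2):
    print(label, round(dice_score(auto, manual, label), 3))
```

A score of 1 means the two maps agree exactly on that structure, and a score of 0 means they do not overlap at all. The percentages quoted in the Fig. 8 caption are averages of such scores expressed on a 0 to 100 scale.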

