Comparative Study
Neuroimage. 2013 Feb 1:66:50-70.
doi: 10.1016/j.neuroimage.2012.10.081. Epub 2012 Nov 7.

A direct morphometric comparison of five labeling protocols for multi-atlas driven automatic segmentation of the hippocampus in Alzheimer's disease

Sean M Nestor et al. Neuroimage.

Abstract

Hippocampal volumetry derived from structural MRI is increasingly used to delineate regions of interest for functional measurements and to assess efficacy in therapeutic trials of Alzheimer's disease (AD), and it has been endorsed by the new AD diagnostic guidelines as a radiological marker of disease progression. Unfortunately, morphological heterogeneity in AD can prevent accurate demarcation of the hippocampus. Recent developments in automated volumetry commonly use multi-template fusion driven by expert manual labels, enabling highly accurate and reproducible segmentation in disease and healthy subjects. However, there are several protocols to define the hippocampus anatomically in vivo, and the method used to generate atlases may impact automatic accuracy and sensitivity, particularly in pathologically heterogeneous samples. Here we report a fully automated segmentation technique that provides a robust platform to directly evaluate both technical and biomarker performance in AD among anatomically unique labeling protocols. For the first time we test head-to-head the performance of five common hippocampal labeling protocols for multi-atlas based segmentation, using both the Sunnybrook Longitudinal Dementia Study and the entire Alzheimer's Disease Neuroimaging Initiative 1 (ADNI-1) baseline and 24-month dataset. We based these atlas libraries on the protocols of (Haller et al., 1997; Killiany et al., 1993; Malykhin et al., 2007; Pantel et al., 2000; Pruessner et al., 2000), and a single operator performed all manual tracings to generate de facto "ground truth" labels. All methods distinguished between normal elders, mild cognitive impairment (MCI), and AD in the expected directions, and showed comparable correlations with measures of episodic memory performance. Only more inclusive protocols distinguished between stable MCI and MCI-to-AD converters, and had slightly better associations with episodic memory.
Moreover, we demonstrate that protocols including more posterior anatomy and dorsal white matter compartments furnish the best voxel-overlap accuracies (Dice Similarity Coefficient=0.87-0.89), compared to expert manual tracings, and achieve the smallest sample sizes required to power clinical trials in MCI and AD. The greatest distribution of errors was localized to the caudal hippocampus and the alveus-fimbria compartment when these regions were excluded. The definition of the medial body did not significantly alter accuracy among more comprehensive protocols. Voxel-overlap accuracies between automatic and manual labels were lower for the more pathologically heterogeneous Sunnybrook study in comparison to the ADNI-1 sample. Finally, accuracy among protocols appears to significantly differ the most in AD subjects compared to MCI and normal elders. Together, these results suggest that selection of a candidate protocol for fully automatic multi-template based segmentation in AD can influence both segmentation accuracy when compared to expert manual labels and performance as a biomarker in MCI and AD.
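
The voxel-overlap accuracy reported above is the Dice Similarity Coefficient, 2|A∩B| / (|A| + |B|), computed between an automatic label A and a manual label B. The following is a minimal sketch of that calculation, assuming each label is available as a flat sequence of 0/1 voxel values in the same image space. The function name and the toy masks are illustrative and are not taken from the paper.

```python
def dice_coefficient(auto_mask, manual_mask):
    """Dice Similarity Coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumed representation; real masks would be 3D volumes).
    """
    if len(auto_mask) != len(manual_mask):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as hippocampus in both masks.
    overlap = sum(1 for a, m in zip(auto_mask, manual_mask) if a and m)
    # Total labeled voxels across the two masks.
    total = sum(1 for a in auto_mask if a) + sum(1 for m in manual_mask if m)
    if total == 0:
        # Both masks empty: treated as perfect agreement by convention.
        return 1.0
    return 2.0 * overlap / total


# Toy example: 3 shared voxels, 4 labeled voxels in each mask.
auto = [0, 1, 1, 1, 0, 0, 1, 0]
manual = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(auto, manual))  # 2 * 3 / (4 + 4) = 0.75
```

A value of 1 means the two labels coincide exactly, and 0 means they share no voxels.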

Keywords: Alzheimer's disease; Automatic hippocampal segmentation; Hippocampal tracing protocol; Multi-atlas.

Figures

Figure 1
3D-rendered right hippocampal volumes for protocols 1–5 from a single Sunnybrook Longitudinal Dementia Study participant with a clinical diagnosis of AD, displayed in dorsomedial orientation with the anterior/head of the hippocampus (forward), medial surface (right) and superior surface (top). The top panel shows the manually labeled hippocampus, whereas the bottom panel shows the volume derived automatically by SBHV using 15 fused templates per protocol. P1 = (Haller et al., 1997), P2 = (Killiany et al., 1993), P3 = (Malykhin et al., 2007), P4 = (Pruessner et al., 2000) and P5 = (Pantel et al., 2000). Image rendered in ITK-Snap (Yushkevich et al., 2006).
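
The caption states that 15 template label sets were fused per protocol but does not say which fusion rule was used. As an illustration only, the sketch below uses per-voxel majority voting, a common choice for multi-atlas fusion. The voting rule, the function name and the flat 0/1 mask representation are all assumptions and should not be read as the SBHV implementation.

```python
def majority_vote_fusion(propagated_labels):
    """Fuse binary label sets already propagated into the target image space.

    propagated_labels: list of flat 0/1 sequences, one per template.
    A voxel is labeled 1 when more than half of the templates vote for it
    (assumed rule; the source does not specify the fusion method).
    """
    n_templates = len(propagated_labels)
    if n_templates == 0:
        raise ValueError("at least one template label set is required")
    n_voxels = len(propagated_labels[0])
    fused = []
    for v in range(n_voxels):
        votes = sum(labels[v] for labels in propagated_labels)
        fused.append(1 if 2 * votes > n_templates else 0)
    return fused


# Toy example with 3 templates instead of 15.
templates = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(majority_vote_fusion(templates))  # [1, 1, 0, 0, 1]
```

With an odd number of templates, such as 15, a strict majority always exists, so no tie-breaking rule is needed.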
Figure 2
Protocol-wise Bland-Altman plots comparing manual labels with SBHV automatically derived labels for the Sunnybrook LOOCV. An optimized protocol was used for SBHV segmentation, which fused the 15 best matching label sets in target image space.
Figure 3
Protocol-wise Bland-Altman plots comparing manual labels with SBHV automatically derived labels of the right hippocampus for the ADNI-1 cross-validation study. An optimized protocol was used for SBHV segmentation, which propagated to and fused the 15 best matching template library label sets in target (query) image space.
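
A Bland-Altman plot shows, for each subject, the difference between two measurements against their mean, together with the mean difference (bias) and the 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The sketch below computes those quantities for paired manual and automatic volumes. The volumes are made-up numbers for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(manual_volumes, auto_volumes):
    """Return per-subject means and differences, the bias, and the 95% limits of agreement."""
    if len(manual_volumes) != len(auto_volumes):
        raise ValueError("paired measurements are required")
    diffs = [a - m for m, a in zip(manual_volumes, auto_volumes)]
    means = [(a + m) / 2 for m, a in zip(manual_volumes, auto_volumes)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits


# Hypothetical hippocampal volumes in mm^3 (not data from the study).
manual = [2500, 2700, 2300, 2600]
auto = [2450, 2720, 2280, 2610]
means, diffs, bias, (lower, upper) = bland_altman(manual, auto)
print(diffs)  # [-50, 20, -20, 10]
print(bias)   # -10
```

A negative bias under this sign convention (automatic minus manual) indicates that the automatic method tends to produce smaller volumes than the manual tracing.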
Figure 4
False negative (FN) coronal distribution maps for SBHV segmentation of the posterior hippocampal region from the results of the LOOCV. The color masks represent voxel-wise FN counts (underestimation) across the five different protocols overlaid on the Sunnybrook average elderly 100-brain template. Each row of panels represents 4 serial slices from posterior (right) to anterior (left) for a given protocol. Panel row A represents the posterior border region for P1 and P3–P5 with no overlay, whereas row B shows the border region for P2 with no overlay, which is located more anterior to the other protocols. SBHV often underestimated the caudal hippocampal region across all protocols; however, the more inclusive protocols P3–P5 demonstrated fewer FN errors than P1 and P2, which excluded portions of the hippocampal tail.
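
The voxel-wise FN and FP count maps described in these captions can be sketched as follows, assuming every subject's manual and automatic masks have already been resampled into a common template space and flattened to 0/1 sequences. That preprocessing, the function name and the toy data are assumptions made for illustration.

```python
def error_count_maps(manual_masks, auto_masks):
    """Count, per voxel, how many subjects show a false negative or false positive.

    FN: voxel labeled manually but missed by the automatic segmentation.
    FP: voxel labeled automatically but absent from the manual label.
    """
    n_voxels = len(manual_masks[0])
    fn_counts = [0] * n_voxels
    fp_counts = [0] * n_voxels
    for manual, auto in zip(manual_masks, auto_masks):
        for v in range(n_voxels):
            if manual[v] and not auto[v]:
                fn_counts[v] += 1  # underestimation at this voxel
            elif auto[v] and not manual[v]:
                fp_counts[v] += 1  # overestimation at this voxel
    return fn_counts, fp_counts


# Toy data: 3 subjects, 4 voxels each.
manual = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 0]]
auto = [[1, 0, 0, 1], [1, 1, 0, 0], [0, 1, 1, 1]]
fn, fp = error_count_maps(manual, auto)
print(fn)  # [0, 1, 1, 0]
print(fp)  # [0, 0, 0, 2]
```

Overlaying such counts on a template, as the figures do, shows where errors concentrate anatomically rather than how large they are for any one subject.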
Figure 5
Protocol-wise coronal false negative (FN) distribution maps for SBHV segmentation of the medial anterior-superior alveus from the results of the LOOCV. The color masks represent voxel-wise FN counts (underestimation) across the five different protocols overlaid on the Sunnybrook average elderly 100-brain template. The most offending regions are highlighted with white arrows. SBHV tended to overestimate the anterior-superior medial white matter compartment (white arrows) to a greater extent in protocols that excluded the alveus and fimbria (i.e. P1 and P2).
Figure 6
Protocol-wise coronal false positive (FP) distribution maps for SBHV segmentation of the superior white matter compartment across the hippocampal body (i.e. alveus/fimbria) from the results of the LOOCV. The color masks represent voxel-wise FP counts (overestimation) across all five protocols projected onto the Sunnybrook average elderly 100-brain template. The most offending regions/protocols are highlighted with white arrows. P1 and P2, which excluded the alveus and fimbria, tended to overestimate the superior white matter compartment, and this may be partially explained by the poor contrast between grey and white matter within this region.
Figure 7
Protocol-wise coronal false positive (FP) distribution maps for SBHV segmentation from the results of the LOOCV. The color masks represent voxel-wise FP counts across the five different surveyed protocols projected onto the Sunnybrook average elderly 100-brain template. White arrows highlight marked overestimation (FP errors) of the inferior hippocampal compartment, which includes background regions of parahippocampal white matter. This FP error similarly affected all protocols in the LOOCV.
Figure 8
False negative (FN) coronal distribution maps for SBHV segmentation of the posterior hippocampal region from the results of the ADNI-1 cross-validation study. The color masks represent voxel-wise FN counts (underestimation) across P1, P2 and P4 (rows) and NC/MCI/AD ADNI-1 groups (columns), projected onto the Sunnybrook average elderly 100-brain template. Note, the P2 posterior border started more anterior to P1 and P4. Caudal FN error distributions for the ADNI-1 validation are similar to those observed in the LOOCV. Qualitatively, caudal FN distributions between protocols varied the most in AD, and within the AD sample all protocols showed significantly different median Dice similarity measures (P4>P1>P2). Further, P2 demonstrated the greatest caudal error as a result of the landmark-based definition used to demarcate the posterior border. For within protocol comparisons, only P4 demonstrated significantly different voxel-wise accuracy measurements between groups (AD>NC).
Figure 9
ADNI-1 group-wise coronal false positive (FP) distribution maps for SBHV-P4 segmentation from the results of the ADNI-1 cross-validation study. White arrows highlight overestimation (FP errors) of the inferior hippocampal compartment. The color masks represent voxel-wise FP counts (overestimation) projected onto the Sunnybrook average elderly 100-brain template. Note that the FP error count was greater along the inferomedial hippocampus in NC and MCI than AD, which may partially explain the lower Dice similarity results in NC and MCI versus AD.
Figure 10
Protocol-specific comparisons of baseline total hippocampal volume (right + left) including ADNI-1 principal groups: Normal controls (NC), Mild Cognitive Impairment (total group) (tMCI) and Alzheimer’s Disease (AD), in addition to ADNI-1 MCI subgroups: MCI converters after 24 months (cMCI), MCI subjects who reverted back to normal elders (rMCI) and MCI subjects who remained stable after 24 months (sMCI). The whiskers represent the 10th and 90th percentiles, and all data beyond these values are plotted. P1 = (Haller et al., 1997), P2 = (Killiany et al., 1993), P3 = (Malykhin et al., 2007), P4 = (Pruessner et al., 2000) and P5 = (Pantel et al., 2000).
Figure 11
Protocol-specific comparisons of hippocampal 24-month rates of change normalized to baseline volume and serial scan window including ADNI-1 principal groups: Normal controls (NC), Mild Cognitive Impairment (total group) (tMCI) and Alzheimer’s Disease (AD), in addition to ADNI-1 MCI subgroups: MCI converters after 24 months (cMCI), MCI subjects who reverted back to normal elders (rMCI) and MCI subjects who remained stable after 24 months (sMCI). The whiskers represent the 10th and 90th percentiles, and all data beyond these values are plotted. P1 = (Haller et al., 1997), P2 = (Killiany et al., 1993), P3 = (Malykhin et al., 2007), P4 = (Pruessner et al., 2000) and P5 = (Pantel et al., 2000).
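
The caption describes rates of change normalized to baseline volume and to the serial scan window, without giving the exact formula. One plausible reading is an annualized percent change: the volume difference divided by baseline volume and by the interval between scans in years. The sketch below follows that reading. The formula, the function name and the example volumes are assumptions, not values or methods confirmed by the source.

```python
def annualized_percent_change(baseline_volume, followup_volume, interval_years):
    """Percent volume change per year, relative to the baseline volume.

    Assumed normalization: 100 * (V_followup - V_baseline) / V_baseline / interval_years.
    """
    if baseline_volume <= 0:
        raise ValueError("baseline volume must be positive")
    if interval_years <= 0:
        raise ValueError("scan interval must be positive")
    change_fraction = (followup_volume - baseline_volume) / baseline_volume
    return 100.0 * change_fraction / interval_years


# Hypothetical subject: 3000 mm^3 at baseline, 2820 mm^3 two years later.
print(annualized_percent_change(3000.0, 2820.0, 2.0))  # about -3.0 (% per year)
```

Dividing by the actual scan interval rather than a nominal 24 months keeps subjects with early or late follow-up scans comparable.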


Publication types