Comparative Study
Hum Brain Mapp. 2006 Feb;27(2):99-113.
doi: 10.1002/hbm.20161.

Quantitative evaluation of automated skull-stripping methods applied to contemporary and legacy images: effects of diagnosis, bias correction, and slice location

Christine Fennema-Notestine et al. Hum Brain Mapp. 2006 Feb.

Abstract

Performance of automated methods to isolate brain from nonbrain tissues in magnetic resonance (MR) structural images may be influenced by MR signal inhomogeneities, type of MR image set, regional anatomy, and age and diagnosis of subjects studied. The present study compared the performance of four methods: Brain Extraction Tool (BET; Smith [2002]: Hum Brain Mapp 17:143–155); 3dIntracranial (Ward [1999] Milwaukee: Biophysics Research Institute, Medical College of Wisconsin; in AFNI); a Hybrid Watershed algorithm (HWA, Segonne et al. [2004] Neuroimage 22:1060–1075; in FreeSurfer); and Brain Surface Extractor (BSE, Sandor and Leahy [1997] IEEE Trans Med Imag 16:41–54; Shattuck et al. [2001] Neuroimage 13:856–876) to manually stripped images. The methods were applied to uncorrected and bias‐corrected datasets; Legacy and Contemporary T1‐weighted image sets; and four diagnostic groups (depressed, Alzheimer's, young and elderly control). To provide a criterion for outcome assessment, two experts manually stripped six sagittal sections for each dataset in locations where brain and nonbrain tissue are difficult to distinguish. Methods were compared on Jaccard similarity coefficients, Hausdorff distances, and an Expectation‐Maximization algorithm. Methods tended to perform better on contemporary datasets; bias correction did not significantly improve method performance. Mesial sections were most difficult for all methods. Although AD image sets were most difficult to strip, HWA and BSE were more robust across diagnostic groups compared with 3dIntracranial and BET. With respect to specificity, BSE tended to perform best across all groups, whereas HWA was more sensitive than other methods. The results of this study may direct users towards a method appropriate to their T1‐weighted datasets and improve the efficiency of processing for large, multisite neuroimaging studies. Hum. Brain Mapping, 2005. © 2005 Wiley‐Liss, Inc.
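
To make two of the comparison metrics named above concrete, here is a minimal sketch of a Jaccard similarity coefficient and a symmetric Hausdorff distance between two binary brain masks. It is an illustration under stated assumptions, not the study's implementation: the function names `jaccard` and `hausdorff` are hypothetical, the masks are tiny toy arrays, and the distance is computed over all mask pixels in pixel units rather than over extracted contours or in millimeters.

```python
import numpy as np


def jaccard(mask_a, mask_b):
    """Jaccard similarity coefficient: |A intersect B| / |A union B|."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumption: two empty masks are treated as identical.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def hausdorff(mask_a, mask_b):
    """Symmetric Hausdorff distance between the pixel sets of two masks.

    Brute force over all pixel pairs, so only suitable for small masks.
    """
    pts_a = np.argwhere(np.asarray(mask_a, dtype=bool))
    pts_b = np.argwhere(np.asarray(mask_b, dtype=bool))
    if len(pts_a) == 0 or len(pts_b) == 0:
        raise ValueError("Hausdorff distance is undefined for an empty mask")
    # Pairwise Euclidean distances, shape (len(pts_a), len(pts_b)).
    dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).max()  # farthest A pixel from its nearest B pixel
    b_to_a = dists.min(axis=0).max()  # farthest B pixel from its nearest A pixel
    return float(max(a_to_b, b_to_a))


# Toy example: the "automatic" mask includes one extra column of nonbrain pixels.
manual = np.zeros((4, 4), dtype=bool)
manual[1:3, 1:3] = True
auto = np.zeros((4, 4), dtype=bool)
auto[1:3, 1:4] = True

jsc = jaccard(manual, auto)    # 4 shared pixels / 6 pixels in the union
hd = hausdorff(manual, auto)   # extra column lies 1 pixel from the manual mask
```

A higher Jaccard coefficient indicates greater overlap with the manual criterion, whereas a larger Hausdorff distance indicates a worse maximum boundary error. For full-size slices, a library routine such as SciPy's `scipy.spatial.distance.directed_hausdorff` would likely be preferable to the brute-force distance matrix used here.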


Figures

Figure 1. Standard location of the six sagittal, manually stripped slices as demonstrated on a coronal image. The six sagittal slices represent the criterion dataset; three slices from each hemisphere in symmetrical locations passing through regions that are difficult to skull‐strip. Slices are numbered for reference. Three sample images are presented in the sagittal plane. Letters represent difficult regions as referenced in the text.
Figure 2. Examples of automatically stripped volumes of a bias corrected, Contemporary YNC dataset. Sagittal sections are taken near the midline to represent extent of CSF and nonbrain tissue included in the resulting volumes.
Figure 3. Examples of automatically stripped volumes of a bias corrected, Legacy YNC dataset. Sagittal sections are lateral to the midline and represent the extent of brain tissue retained or excluded from the resulting volumes.
Figure 4. Examples of outcomes for a bias corrected, Contemporary ENC dataset. Each pair of figures includes solid color overlays on the stripped image (left) and the contours of these shapes (right). Left, Yellow = regions included in the manual but not in the automatic outcome. Blue = regions included in the automatic but not in the manual outcome. Right, Yellow = contour of manually‐stripped dataset. Red = contour of automatically stripped dataset.
Figure 5. Mean (std. error bars) Jaccard similarity coefficient (JSC) for Diagnostic Group by Method relative to the manually stripped slices from Anatomist 1. Mean JSC for the two manual raters (0.938) is represented by the horizontal dashed black line.
Figure 6. Mean (std. error bars) Hausdorff distance for Diagnostic Group by Method relative to the manually stripped slices from Anatomist 1. Mean Hausdorff distance for the two manual raters (5.5) is represented by the horizontal dashed black line.
Figure 7. Mean Specificity from the Expectation‐Maximization (EM) analysis by Diagnostic Group for each Method.
Figure 8. Mean Sensitivity from the Expectation‐Maximization (EM) analysis by Diagnostic Group for each Method.
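
As a reading aid for the sensitivity and specificity values in Figures 7 and 8, the sketch below computes both quantities for an automatic mask against a single reference mask. This is a simplification and an assumption: the study derived its values from an Expectation‐Maximization analysis, which this direct pixel count does not reproduce. The function name `sensitivity_specificity` and the toy masks are hypothetical.

```python
import numpy as np


def sensitivity_specificity(auto_mask, ref_mask):
    """Pixelwise sensitivity and specificity of auto_mask relative to ref_mask.

    Sensitivity = TP / (TP + FN): fraction of reference brain pixels retained.
    Specificity = TN / (TN + FP): fraction of reference nonbrain pixels removed.
    Returns NaN for a ratio whose denominator is zero.
    """
    auto = np.asarray(auto_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    tp = np.sum(auto & ref)
    fn = np.sum(~auto & ref)
    tn = np.sum(~auto & ~ref)
    fp = np.sum(auto & ~ref)
    sens = float(tp / (tp + fn)) if (tp + fn) > 0 else float("nan")
    spec = float(tn / (tn + fp)) if (tn + fp) > 0 else float("nan")
    return sens, spec


# Toy example: the automatic mask keeps all brain pixels plus two nonbrain pixels.
ref = np.zeros((4, 4), dtype=bool)
ref[1:3, 1:3] = True
auto = np.zeros((4, 4), dtype=bool)
auto[1:3, 1:4] = True

sens, spec = sensitivity_specificity(auto, ref)
```

In this toy case the automatic mask is fully sensitive (no brain pixels lost) but less specific (some nonbrain pixels retained), which mirrors the trade-off the abstract describes between a more sensitive method and a more specific one.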
