Comparative Study

Hum Brain Mapp. 2006 Feb;27(2):99-113.

doi: 10.1002/hbm.20161.

Quantitative evaluation of automated skull-stripping methods applied to contemporary and legacy images: effects of diagnosis, bias correction, and slice location

Christine Fennema-Notestine¹, I Burak Ozyurt, Camellia P Clark, Shaunna Morris, Amanda Bischoff-Grethe, Mark W Bondi, Terry L Jernigan, Bruce Fischl, Florent Segonne, David W Shattuck, Richard M Leahy, David E Rex, Arthur W Toga, Kelly H Zou, Gregory G Brown

Affiliation

¹ Laboratory of Cognitive Imaging, Department of Psychiatry, University of California, San Diego, and Veterans Affairs San Diego Healthcare System, San Diego, La Jolla, California 92093, USA.

PMID: 15986433
PMCID: PMC2408865
DOI: 10.1002/hbm.20161

Christine Fennema-Notestine et al. Hum Brain Mapp. 2006 Feb.

Abstract

Performance of automated methods to isolate brain from nonbrain tissues in magnetic resonance (MR) structural images may be influenced by MR signal inhomogeneities, type of MR image set, regional anatomy, and age and diagnosis of subjects studied. The present study compared the performance of four methods: Brain Extraction Tool (BET; Smith [2002]: Hum Brain Mapp 17:143–155); 3dIntracranial (Ward [1999] Milwaukee: Biophysics Research Institute, Medical College of Wisconsin; in AFNI); a Hybrid Watershed algorithm (HWA, Segonne et al. [2004] Neuroimage 22:1060–1075; in FreeSurfer); and Brain Surface Extractor (BSE, Sandor and Leahy [1997] IEEE Trans Med Imag 16:41–54; Shattuck et al. [2001] Neuroimage 13:856–876) to manually stripped images. The methods were applied to uncorrected and bias‐corrected datasets; Legacy and Contemporary T₁‐weighted image sets; and four diagnostic groups (depressed, Alzheimer's, young and elderly control). To provide a criterion for outcome assessment, two experts manually stripped six sagittal sections for each dataset in locations where brain and nonbrain tissue are difficult to distinguish. Methods were compared on Jaccard similarity coefficients, Hausdorff distances, and an Expectation‐Maximization algorithm. Methods tended to perform better on contemporary datasets; bias correction did not significantly improve method performance. Mesial sections were most difficult for all methods. Although AD image sets were most difficult to strip, HWA and BSE were more robust across diagnostic groups compared with 3dIntracranial and BET. With respect to specificity, BSE tended to perform best across all groups, whereas HWA was more sensitive than other methods. The results of this study may direct users towards a method appropriate to their T₁‐weighted datasets and improve the efficiency of processing for large, multisite neuroimaging studies. Hum. Brain Mapping, 2005. © 2005 Wiley‐Liss, Inc.

Figures

**Figure 1**
Standard location of the six sagittal, manually stripped slices as demonstrated on a coronal image. The six sagittal slices represent the criterion dataset; three slices from each hemisphere in symmetrical locations passing through regions that are difficult to skull‐strip. Slices are numbered for reference. Three sample images are presented in the sagittal plane. Letters represent difficult regions as referenced in the text.

**Figure 2**
Examples of automatically stripped volumes of a bias corrected, Contemporary YNC dataset. Sagittal sections are taken near the midline to represent extent of CSF and nonbrain tissue included in the resulting volumes.

**Figure 3**
Examples of automatically stripped volumes of a bias corrected, Legacy YNC dataset. Sagittal sections are lateral to the midline and represent the extent of brain tissue retained or excluded from the resulting volumes.

**Figure 4**
Examples of outcomes for a bias corrected, Contemporary ENC dataset. Each pair of figures includes solid color overlays on the stripped image (left) and the contours of these shapes (right). Left, Yellow = regions included in the manual but not in the automatic outcome. Blue = regions included in the automatic but not in the manual outcome. Right, Yellow = contour of manually‐stripped dataset. Red = contour of automatically stripped dataset.

**Figure 5**
Mean (std. error bars) Jaccard similarity coefficient (JSC) for Diagnostic Group by Method relative to the manually stripped slices from Anatomist 1. Mean JSC for the two manual raters (0.938) is represented by the horizontal dashed black line.
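
The Jaccard similarity coefficient reported here is the size of the overlap between two brain masks divided by the size of their union, so 1.0 means identical masks and 0.0 means no overlap. The following is a minimal Python sketch of that ratio, not the study's own code; the function name `jaccard` and the toy 3 x 4 masks are illustrative assumptions.

```python
def jaccard(mask_a, mask_b):
    """Jaccard similarity |A n B| / |A u B| for two binary masks.

    Masks are equal-sized 2D lists of 0/1, where 1 marks a pixel
    labelled as brain. (Hypothetical helper, for illustration only.)
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks agree completely; treat that as a perfect score.
    return 1.0 if union == 0 else intersection / union


# Toy slice: the "automatic" mask adds one pixel and misses one pixel.
manual = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
automatic = [
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]

print(round(jaccard(manual, automatic), 3))  # 7 shared / 9 in union
```

Because the denominator is the union, both kinds of error (nonbrain tissue retained and brain tissue removed) lower the score.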

**Figure 6**
Mean (std. error bars) Hausdorff distance for Diagnostic Group by Method relative to the manually stripped slices from Anatomist 1. Mean Hausdorff distance for the two manual raters (5.5) is represented by the horizontal dashed black line.
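
The Hausdorff distance complements an overlap score: it reports the largest gap between two contours, so a single badly misplaced boundary segment dominates the value. Below is a brute-force Python sketch of the symmetric Hausdorff distance between two point sets. It assumes each contour is given as a list of (x, y) coordinates in a common unit; the function names and the toy contours are hypothetical and are not taken from the study.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(
        min(math.dist(p, q) for q in points_b)
        for p in points_a
    )


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance: the larger of the two directed values.

    Both point sets must be non-empty; max() raises ValueError otherwise.
    """
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )


# Toy contours: one corner of the "automatic" outline is displaced by 3 units.
manual_contour = [(0, 0), (0, 4), (4, 4), (4, 0)]
automatic_contour = [(0, 0), (0, 4), (4, 4), (7, 0)]

print(hausdorff(manual_contour, automatic_contour))  # 3.0
```

This version compares every pair of points, which is fine for single-slice contours but scales quadratically with contour length.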

**Figure 7**
Mean Specificity from the Expectation‐Maximization (EM) analysis by Diagnostic Group for each Method.

**Figure 8**
Mean Sensitivity from the Expectation‐Maximization (EM) analysis by Diagnostic Group for each Method.
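
Sensitivity here is the fraction of true brain pixels that a method retains, and specificity is the fraction of true nonbrain pixels that it removes. The study estimates both with an Expectation-Maximization analysis; the Python sketch below is a deliberate simplification that scores a candidate mask against a single fixed reference mask rather than an estimated one. The function name and toy masks are illustrative assumptions.

```python
def sensitivity_specificity(reference, candidate):
    """Sensitivity and specificity of a candidate mask versus a reference.

    Both masks are equal-sized 2D lists of 0/1 (1 = brain).
    Simplification: the reference is taken as ground truth, which is
    NOT the EM-based estimate used in the study.
    """
    tp = fp = tn = fn = 0
    for row_r, row_c in zip(reference, candidate):
        for r, c in zip(row_r, row_c):
            if r and c:
                tp += 1      # brain kept
            elif r:
                fn += 1      # brain removed
            elif c:
                fp += 1      # nonbrain kept
            else:
                tn += 1      # nonbrain removed
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


reference_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
candidate_mask = [
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]

sens, spec = sensitivity_specificity(reference_mask, candidate_mask)
print(sens, spec)  # 0.875 0.75
```

A method that strips conservatively tends toward high sensitivity at the cost of specificity, and an aggressive one shows the reverse, which is why the two values are reported separately.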
