Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation
- PMID: 15250643
- PMCID: PMC1283110
- DOI: 10.1109/TMI.2004.828354
Abstract
Characterizing the performance of image segmentation approaches has been a persistent challenge. Performance analysis is important since segmentation algorithms often have limited accuracy and precision. Interactive drawing of the desired segmentation by human raters has often been the only acceptable approach, and yet suffers from intra-rater and inter-rater variability. Automated algorithms have been sought in order to remove the variability introduced by raters, but such algorithms must be assessed to ensure they are suitable for the task. The performance of raters (human or algorithmic) generating segmentations of medical images has been difficult to quantify because of the difficulty of obtaining or estimating a known true segmentation for clinical data. Although physical and digital phantoms can be constructed for which ground truth is known or readily estimated, such phantoms do not fully reflect clinical images due to the difficulty of constructing phantoms which reproduce the full range of imaging characteristics and normal and pathological anatomical variability observed in clinical data. Comparison to a collection of segmentations by raters is an attractive alternative since it can be carried out directly on the relevant clinical imaging data. However, the most appropriate measure or set of measures with which to compare such segmentations has not been clarified and several measures are used in practice. We present here an expectation-maximization algorithm for simultaneous truth and performance level estimation (STAPLE). The algorithm considers a collection of segmentations and computes a probabilistic estimate of the true segmentation and a measure of the performance level represented by each segmentation. The source of each segmentation in the collection may be an appropriately trained human rater or raters, or may be an automated segmentation algorithm. 
The probabilistic estimate of the true segmentation is formed by estimating an optimal combination of the segmentations, weighting each segmentation according to its estimated performance level, and incorporating a prior model for the spatial distribution of the structures being segmented as well as spatial homogeneity constraints. STAPLE is straightforward to apply to clinical imaging data, readily enables assessment of the performance of an automated image segmentation algorithm, and enables direct comparison of human rater and algorithm performance.
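The expectation-maximization idea described in the abstract can be illustrated with a minimal sketch for the binary (foreground/background) case. This is not the authors' implementation: it assumes a single global foreground prior instead of a spatially varying one, omits the spatial homogeneity constraints entirely, and summarizes each rater's performance level as a sensitivity `p` and specificity `q`. The function name `staple_binary`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def staple_binary(decisions, prior=None, max_iter=100, tol=1e-6):
    """Simplified binary STAPLE-style EM (illustrative sketch only).

    decisions: array of shape (n_raters, n_voxels) with entries 0 or 1.
    Returns (W, p, q): per-voxel foreground probability, and per-rater
    sensitivity and specificity estimates.
    """
    D = np.asarray(decisions, dtype=float)
    n_raters, _ = D.shape
    if prior is None:
        # Assumption: one global prior, the mean foreground fraction.
        prior = D.mean()
    p = np.full(n_raters, 0.9)  # initial sensitivity guess (assumption)
    q = np.full(n_raters, 0.9)  # initial specificity guess (assumption)
    eps = 1e-12                 # guards against division by zero

    for _ in range(max_iter):
        # E-step: probability that each voxel is truly foreground,
        # given every rater's decision and the current p, q.
        a = prior * np.prod(np.where(D == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(D == 0, q[:, None], 1 - q[:, None]), axis=0)
        W = a / (a + b + eps)

        # M-step: re-estimate each rater's sensitivity and specificity,
        # weighting voxels by the current truth estimate.
        p_new = (D @ W) / (W.sum() + eps)
        q_new = ((1 - D) @ (1 - W)) / ((1 - W).sum() + eps)

        change = max(np.abs(p_new - p).max(), np.abs(q_new - q).max())
        p, q = p_new, q_new
        if change < tol:
            break
    return W, p, q

# Toy data: two raters agree with a reference labelling, one is noisy.
truth = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
D = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
])
W, p, q = staple_binary(D)
estimate = (W > 0.5).astype(int)  # threshold the probabilistic estimate
```

In this sketch the E-step plays the role of the probabilistic truth estimate and the M-step plays the role of the performance-level estimate; raters who disagree with the evolving consensus receive lower `p` and `q` and therefore less weight in the next E-step.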
Grants and funding
- P01 CA67165/CA/NCI NIH HHS/United States
- R01 LM007861-01A1/LM/NLM NIH HHS/United States
- R33 CA99015/CA/NCI NIH HHS/United States
- R01 AG19513/AG/NIA NIH HHS/United States
- R01 NS035142/NS/NINDS NIH HHS/United States
- R01 LM007861/LM/NLM NIH HHS/United States
- R01 AG019513/AG/NIA NIH HHS/United States
- R21 CA89449/CA/NCI NIH HHS/United States
- P41 RR013218/RR/NCRR NIH HHS/United States
- R01 CA086879/CA/NCI NIH HHS/United States
- R21 MH067054/MH/NIMH NIH HHS/United States
- P41 RR13218/RR/NCRR NIH HHS/United States
- R01 NS35142/NS/NINDS NIH HHS/United States
- R21 MH67054/MH/NIMH NIH HHS/United States
- P01 CA067165/CA/NCI NIH HHS/United States
- R01 CA86879/CA/NCI NIH HHS/United States
- R01 LM007861-02/LM/NLM NIH HHS/United States
