Neuroimage Clin. 2019;22:101727.
doi: 10.1016/j.nicl.2019.101727. Epub 2019 Feb 22.

Inter-rater agreement in glioma segmentations on longitudinal MRI

M Visser et al. Neuroimage Clin. 2019.

Abstract

Background: Tumor segmentation of glioma on MRI is a technique to monitor, quantify and report disease progression. Manual MRI segmentation is the gold standard but very labor intensive. At present the quality of this gold standard is not known for different stages of the disease, and prior work has mainly focused on treatment-naive glioblastoma. In this paper we studied the inter-rater agreement of manual MRI segmentation of glioblastoma and WHO grade II-III glioma for novices and experts at three stages of disease. We also studied the impact of inter-observer variation on extent of resection and growth rate.

Methods: In 20 patients with WHO grade IV glioblastoma and 20 patients with WHO grade II-III glioma (defined as non-glioblastoma) both the enhancing and non-enhancing tumor elements were segmented on MRI, using specialized software, by four novices and four experts before surgery, after surgery and at time of tumor progression. We used the generalized conformity index (GCI) and the intra-class correlation coefficient (ICC) of tumor volume as main outcome measures for inter-rater agreement.
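The abstract does not spell out how the GCI is computed. A minimal sketch, assuming the commonly used pairwise definition (the sum of pairwise intersections divided by the sum of pairwise unions, which reduces to the Jaccard index for two raters), could look as follows. Representing each segmentation as a set of voxel coordinates and the function name are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def generalized_conformity_index(masks):
    """Spatial overlap among two or more raters.

    Assumed definition: sum over all rater pairs of |A ∩ B|,
    divided by the sum over all rater pairs of |A ∪ B|.
    Each mask is a set of voxel coordinates (hypothetical representation).
    """
    masks = [set(m) for m in masks]
    if len(masks) < 2:
        raise ValueError("at least two segmentations are required")

    intersection_total = 0
    union_total = 0
    for a, b in combinations(masks, 2):
        intersection_total += len(a & b)
        union_total += len(a | b)

    # No rater segmented anything: the index is undefined.
    if union_total == 0:
        return float("nan")
    return intersection_total / union_total


# Three toy one-dimensional "segmentations" (voxel indices).
rater_a = {0, 1}
rater_b = {0, 1, 2}
rater_c = {1, 2}
gci = generalized_conformity_index([rater_a, rater_b, rater_c])
```

In this toy case the pairwise intersections sum to 5 voxels and the pairwise unions to 9, so the index is 5/9.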

Results: For glioblastoma, segmentations by experts and novices were comparable. The inter-rater agreement of enhancing tumor elements was excellent before surgery (GCI 0.79, ICC 0.99), poor after surgery (GCI 0.32, ICC 0.92), and good at progression (GCI 0.65, ICC 0.91). For non-glioblastoma, the inter-rater agreement was generally higher between experts than between novices. Between experts, the inter-rater agreement was excellent before surgery (GCI 0.77, ICC 0.92), reasonable after surgery (GCI 0.48, ICC 0.84), and good at progression (GCI 0.60, ICC 0.80). Between novices, the inter-rater agreement was good before surgery (GCI 0.66, ICC 0.73), poor after surgery (GCI 0.33, ICC 0.55), and poor at progression (GCI 0.36, ICC 0.73). Further analysis showed that the lower inter-rater agreement of segmentation on postoperative MRI could only partly be explained by the smaller volumes and fragmentation of residual tumor. The median interquartile range of extent of resection between raters was 8.3% and of growth rate was 0.22 mm/year.
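To illustrate how a "median interquartile range between raters" might be obtained, the sketch below assumes that extent of resection is the percentage of preoperative tumor volume removed, that the interquartile range is taken over the raters for each patient, and that the median is then taken over patients. The abstract does not state these formulas, so the function names, the quartile method and the toy numbers are all assumptions.

```python
import statistics


def extent_of_resection(pre_volume, post_volume):
    """Percentage of preoperative volume removed (assumed definition)."""
    if pre_volume <= 0:
        raise ValueError("preoperative volume must be positive")
    return 100.0 * (pre_volume - post_volume) / pre_volume


def interquartile_range(values):
    """Q3 minus Q1, using the 'inclusive' quartile method (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def median_iqr_between_raters(per_patient_values):
    """Median over patients of the per-patient IQR across raters."""
    return statistics.median(
        [interquartile_range(v) for v in per_patient_values]
    )


# Toy data: extent of resection (%) per rater, one list per patient.
toy = [
    [80, 85, 90, 95, 100],
    [50, 50, 50, 50, 50],
    [0, 25, 50, 75, 100],
]
summary = median_iqr_between_raters(toy)
```

With these toy values the per-patient interquartile ranges are 10, 0 and 50 percentage points, so the median is 10.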

Conclusion: Manual tumor segmentations on MRI have reasonable agreement for use in spatial and volumetric analysis. Agreement in spatial overlap is of concern with segmentation after surgery for glioblastoma and with segmentation of non-glioblastoma by non-experts.

Keywords: Glioblastoma; Glioma; Inter-rater agreement; Low-grade glioma; MRI; Manual segmentation.

Figures

Fig. 1
Bar plots of the number of patients against the number of expert (EX) and novice (NO) raters who detected any enhancing tumor and any non-enhancing tumor, for glioblastoma and non-glioblastoma, on preoperative, postoperative and progression MRI.
Fig. 2
Box plots of the spatial overlap among experts (EX) and novices (NO) measured as generalized conformity index for enhancing tumor and non-enhancing tumor segmentations of 20 glioblastoma and 20 non-glioblastoma patients in MRIs taken at preoperative, postoperative and progression time points. Each dot represents the agreement among raters for one patient's MRI. Indices above 0.7 are considered excellent. The median of measurements and interquartile distances are plotted as boxes, which were omitted when fewer than five data points were present. Few data points were available for enhancing tumor segmentations in non-glioblastoma, because the generalized conformity index could not be calculated when fewer than two observers detected tumor.
Fig. 3
Spatial overlap agreement as generalized conformity index versus tumor volume (average over experts) of enhancing tumor (A) and non-enhancing tumor (B) segmentations for glioblastomas and non-glioblastomas at subsequent MRI timings. Each dot represents the agreement of spatial overlap among experts on one patient's MRI. For enhancing tumor on postoperative MRI, spatial overlap increases after artificial dilation of the segmentations (grey dots), but not to the level seen for progression segmentations of the same volume.
Fig. 4
Box plots of agreement between majority vote of all eight raters and each of the individual raters, as Jaccard index for enhancing tumor and non-enhancing tumor segmentations in glioblastoma and non-glioblastoma at the three MRI time points. Each dot represents the agreement between the consensus and the individual rater for one patient's segmentation. The first four subplots represent the experts, the second four refer to the novices. The median of measurements and interquartile distances are plotted as boxes, which were omitted when fewer than five data points were measured.
Fig. 5
Boxplots of agreement between rater and majority vote consensus of experts and novices combined measured as Jaccard index for enhancing and non-enhancing tumor segmentations in glioblastoma and non-glioblastoma at three MRI timings. Each dot represents the agreement between a rater's segmentations and the majority vote consensus of all raters for one patient's segmentation. Indices above 0.7 are considered excellent. The median of measurements and interquartile distances are plotted as boxes, which were omitted when fewer than five data points were measured.
Fig. 6
The variation in extent of resection and growth rate for glioblastoma and non-glioblastoma between eight raters per patient. In each plot patients are sorted by median extent of resection and growth rate, respectively. Each dot represents the calculation for one patient of one rater. Experts and novices are labelled according to the legend. The median of measurements and interquartile distances are plotted as boxes. The quartile coefficients of dispersion are plotted below the boxplots.
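The captions for Fig. 4 and Fig. 5 compare each rater with a majority-vote consensus using the Jaccard index. A minimal sketch of that comparison is given below, assuming that a voxel enters the consensus when strictly more than half of the raters marked it (how ties among eight raters were handled is not stated) and that segmentations are sets of voxel coordinates. Both function names are hypothetical.

```python
from collections import Counter


def majority_vote(masks):
    """Voxels marked by strictly more than half of the raters (assumed rule)."""
    counts = Counter(voxel for mask in masks for voxel in set(mask))
    threshold = len(masks) / 2
    return {voxel for voxel, n in counts.items() if n > threshold}


def jaccard_index(a, b):
    """Intersection over union of two voxel sets."""
    a, b = set(a), set(b)
    union = a | b
    # Both segmentations empty: the index is undefined.
    if not union:
        return float("nan")
    return len(a & b) / len(union)


# Four toy raters; only voxel 3 is marked by more than two of them.
raters = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {3}]
consensus = majority_vote(raters)
per_rater_agreement = [jaccard_index(r, consensus) for r in raters]
```

Here the consensus contains a single voxel, so the first three raters each score 1/3 against it and the fourth scores 1.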
