Korean J Radiol. 2022 Oct;23(10):959-975.
doi: 10.3348/kjr.2022.0067.

Agreement and Reliability between Clinically Available Software Programs in Measuring Volumes and Normative Percentiles of Segmented Brain Regions

Huijin Song et al. Korean J Radiol. 2022 Oct.

Erratum in

Abstract

Objective: To investigate the agreement and reliability of estimating the volumes and normative percentiles (N%) of segmented brain regions among NeuroQuant (NQ), DeepBrain (DB), and FreeSurfer (FS) software programs, focusing on the comparison between NQ and DB.

Materials and methods: Three-dimensional T1-weighted images of 145 participants (48 healthy participants, 50 patients with mild cognitive impairment, and 47 patients with Alzheimer's disease) from a single medical center (SMC) dataset and 130 participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset were included in this retrospective study. All images were analyzed with DB, NQ, and FS software to obtain volume estimates and N% of various segmented brain regions. We used Bland-Altman analysis, repeated measures ANOVA, reproducibility coefficient, effect size, and intraclass correlation coefficient (ICC) to evaluate inter-method agreement and reliability.
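As an illustration of the Bland-Altman analysis named above, the following minimal sketch computes the bias (mean difference) and the 95% limits of agreement for paired volume estimates from two programs. The function name `bland_altman` and the volume values are hypothetical and are not taken from the study; the sketch assumes the conventional limits of mean difference ± 1.96 SD.

```python
import numpy as np

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD
    of the paired differences (sample SD, ddof=1).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b                  # per-participant difference between methods
    bias = diff.mean()            # systematic offset between the two methods
    sd = diff.std(ddof=1)         # spread of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical hippocampal volumes in cm^3 (illustrative values only).
nq = [3.9, 4.1, 3.5, 4.4, 3.8]
db = [4.2, 4.5, 3.6, 4.9, 4.0]

bias, lower, upper = bland_altman(nq, db)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

A bias far from zero indicates a systematic offset between two programs, while wide limits indicate that individual measurements cannot be exchanged even when the mean offset is small.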

Results: Among the three software programs, the Bland-Altman plot showed a substantial bias, the ICC showed a broad range of reliability (0.004 to 0.97), and repeated-measures ANOVA revealed significant mean volume differences in all brain regions. Similarly, the volume differences of the three software programs had large effect sizes in most regions (0.73 to 5.51). The effect size was largest in the pallidum in both datasets and smallest in the thalamus and cerebral white matter in the SMC and ADNI datasets, respectively. N% of NQ and DB showed unacceptably broad Bland-Altman limits of agreement in all brain regions and a very wide range of ICC values (-0.142 to 0.844) in most brain regions.
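To make the ICC values above more concrete, here is a minimal sketch of one common form, the two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1). The abstract does not state which ICC form was used, so this choice is an assumption, and the function name `icc_2_1` and the input matrix are illustrative only. Rows are participants and columns are methods.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    x is an (n participants) x (k methods) array. The ICC form is an
    assumption; the source abstract does not specify it.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between participants
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between methods
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative 6 x 4 matrix (not study data).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(ratings)
print(f"ICC(2,1) = {icc:.2f}")
```

Because this form measures absolute agreement, a constant offset between methods lowers the ICC even when the methods rank participants identically, which is one way a systematic volume bias can produce low reliability values.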

Conclusion: NQ and DB showed significant differences in the measured volume and N%, with limited agreement and reliability for most brain regions. Therefore, users should be aware of the lack of interchangeability between these software programs when they are applied in clinical practice.

Keywords: DeepBrain; FreeSurfer; Intermethod validation; MR volumetry; NeuroQuant; Normative percentile.

Conflict of interest statement

Seung Hong Choi, who is on the editorial board of the Korean Journal of Radiology, was not involved in the editorial evaluation or decision to publish this article. All remaining authors have declared no conflicts of interest.

Figures

Fig. 1. Study design flow chart.
AD = Alzheimer’s disease, ADNI = Alzheimer’s Disease Neuroimaging Initiative, MCI = mild cognitive impairment
Fig. 2. Representative case of a 69-year-old female with Alzheimer’s disease.
A-D. Axial T1-weighted imaging (A); color overlays based on FreeSurfer (B), DeepBrain (C), and NeuroQuant (D). The overlaid area of the bilateral globus pallidus (marked with stars) is smaller with NeuroQuant (D) than with FreeSurfer (B) or DeepBrain (C).
Fig. 3. Box-and-whisker plots illustrating differences in measured regional brain volume derived from NQ, DB, and FS in the SMC and ADNI datasets.
A-D. Panels A (SMC) and B (ADNI) show the smaller brain regions (caudate, pallidum, putamen, thalamus, amygdala, and hippocampus), and panels C (SMC) and D (ADNI) show the cortical gray matter, cerebral white matter, cerebellum, and total intracranial volume. The lines inside the boxes and the lower and upper boundary lines represent the median, 25th, and 75th percentile values, respectively, with whiskers extending from the median to ± 1.5 × the interquartile range; outliers beyond the whiskers are represented by points. ADNI = Alzheimer’s Disease Neuroimaging Initiative, DB = DeepBrain, FS = FreeSurfer, NQ = NeuroQuant, SMC = single medical center
Fig. 4. Bland-Altman plots for agreement between each pair of software programs for regional brain volume.
A, B. Represent SMC and ADNI data, respectively. The units for both the x- and y-axes are cm³. NQ tends to overestimate large volumes and underestimate small volumes compared with the FS measurement of cerebral cortical GM in both SMC (A) and ADNI data (B). In contrast, DB tends slightly to underestimate large volumes and overestimate small volumes compared with the FS measurement of cerebral cortical GM in both A and B. The orange circle, brown square, and purple circle in A indicate the Alzheimer’s disease, mild cognitive impairment, and normal elderly subgroups, respectively. The blue triangle, red square, and green circle in B indicate the 1.5T Siemens, 3T GE, and 3T Philips subgroups, respectively. The brown horizontal dashed lines delineate the 95% limits of agreement (mean difference ± 1.96 SDs, the range within which about 95% of individual differences are expected to fall). The orange horizontal dashed line represents the line of equality (zero difference between the two software measurements). The blue horizontal line indicates the mean difference between the two software measurements. ADNI = Alzheimer’s Disease Neuroimaging Initiative, DB = DeepBrain, FS = FreeSurfer, GM = gray matter, NQ = NeuroQuant, SD = standard deviation, SMC = single medical center
Fig. 5. Box-and-whisker plots showing differences in N% of regional brain volume derived from NQ and DB.
A, B. Represent N% of the single medical center and ADNI data, respectively. The lines inside the boxes and the lower and upper boundary lines represent the median, 25th, and 75th percentile values, respectively, with whiskers extending from the median to ± 1.5 × the interquartile range; outliers beyond the whiskers are represented by points. ADNI = Alzheimer’s Disease Neuroimaging Initiative, DB = DeepBrain, GM = gray matter, NQ = NeuroQuant, N% = normative percentiles, TICV = total intracranial volume, WM = white matter
Fig. 6. Bland-Altman plots for agreement of the normative percentile of the hippocampus (A, C, E) and cortical gray matter (B, D, F) between NQ and DB.
A-F. A and B represent SMC data, and C-F represent ADNI data. The points tend to form a triangular or rhomboid shape on the Bland-Altman plot, with unacceptably broad limits of agreement for all datasets. In A-D, the orange circle, brown square, and purple circle indicate the Alzheimer’s disease, MCI, and NL control subgroups, respectively, in both datasets. In E and F, the blue triangle, red square, and green circle indicate the 1.5T Siemens, 3T GE, and 3T Philips subgroups, respectively. The brown horizontal dashed lines delineate the 95% limits of agreement (mean difference ± 1.96 SDs, the range within which about 95% of individual differences are expected to fall). The orange horizontal dashed line represents the line of equality (zero difference between the two software measurements). The blue horizontal line is the mean difference between the two software measurements. ADNI = Alzheimer’s Disease Neuroimaging Initiative, DB = DeepBrain, MCI = mild cognitive impairment, NL = normal elderly participants, NQ = NeuroQuant, SD = standard deviation, SMC = single medical center
