Neuroimage. 2011 Feb 1;54(3):2163-75.
doi: 10.1016/j.neuroimage.2010.09.076. Epub 2010 Oct 13.

Multisite reliability of cognitive BOLD data

Gregory G Brown et al. Neuroimage. 2011.

Abstract

Investigators perform multi-site functional magnetic resonance imaging studies to increase statistical power, to enhance generalizability, and to improve the likelihood of sampling relevant subgroups. Yet undesired site variation in imaging methods could offset these potential advantages. We used variance components analysis to investigate sources of variation in the blood oxygen level-dependent (BOLD) signal across four 3-T magnets in voxelwise and region-of-interest (ROI) analyses. Eighteen participants traveled to four magnet sites to complete eight runs of a working memory task involving emotional or neutral distraction. Person variance was more than 10 times larger than site variance for five of six ROIs studied. Person-by-site interactions, however, contributed sizable unwanted variance to the total. Averaging over runs increased between-site reliability, with many voxels showing good to excellent between-site reliability when eight runs were averaged and regions of interest showing fair to good reliability. Between-site reliability depended on the specific functional contrast analyzed in addition to the number of runs averaged. Although median effect size was correlated with between-site reliability, dissociations were observed for many voxels. Brain regions where the pooled effect size was large but between-site reliability was poor were associated with reduced individual differences. Brain regions where the pooled effect size was small but between-site reliability was excellent were associated with a balance of participants who displayed consistently positive or consistently negative BOLD responses. Although between-site reliability of BOLD data can be good to excellent, acquiring highly reliable data requires robust activation paradigms, ongoing quality assurance, and careful experimental control.
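The link between variance components, run averaging, and between-site reliability can be illustrated with a minimal sketch. It assumes a simplified generalizability-theory model in which all run-dependent variance is lumped into one term that shrinks as 1/n when n runs are averaged. The function name, the lumping of terms, and the numeric values are illustrative assumptions, not the authors' model or results.

```python
def between_site_icc(var_person, var_site, var_person_by_site,
                     var_run_related, n_runs):
    """Absolute-agreement between-site ICC for a score averaged over n_runs.

    Assumed simplification: every run-dependent component (run, interactions
    with run, residual) is lumped into var_run_related and divided by n_runs.
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    # Variance that does not average out over runs within a site.
    site_error = var_site + var_person_by_site
    # Variance that averaging over runs reduces.
    run_error = var_run_related / n_runs
    return var_person / (var_person + site_error + run_error)


# Hypothetical variance components (arbitrary units), chosen only so that
# person variance is more than 10 times site variance.
components = dict(var_person=10.0, var_site=0.5,
                  var_person_by_site=3.0, var_run_related=16.0)

for n in (1, 2, 4, 8):
    print(n, round(between_site_icc(n_runs=n, **components), 3))
```

Under these assumed values the ICC rises as more runs are averaged but cannot exceed var_person / (var_person + var_site + var_person_by_site), which is why a sizable person-by-site interaction limits between-site reliability no matter how many runs are collected.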

Figures

Figure 1
Working Memory Task (WM) with Emotional Distraction. Each blue square represents an acquired echo-planar volume. Every WM epoch is preceded and followed by passively viewed scrambled faces. During the encode period, eight line drawings of common objects are presented at a 2 s rate. During the maintain period, individuals are instructed to remember the eight learned pictures while deciding whether intervening pictures included a human face. The block of pictures presented during the maintain period was composed of either emotionally aversive or neutral pictures. During the recognition probe period, individuals decided which of two line drawing pictures was studied during the encode period. Decisions during the maintain and recognition periods were indicated by a button press. An orientation cross was presented 6 s before and after each WM-scrambled faces cycle.
Figure 2
Variance components for the recognition probe versus scrambled faces contrast.
Figure 3
Data from the recognition probe versus scrambled faces contrast. (A) Between-site reliability for increasing numbers of runs averaged. (B) Within-site, between-session reliability for eight runs averaged at the site where scans were repeated (Site D).
Figure 4
Percent of voxels in a particular between-site reliability bin significantly activated at three different levels of significance based on the pooled t-test of recognition versus scrambled contrast. Reliability intervals were based on the eight-run between-site reliability map.
Figure 5
(A) Effect size (Cohen’s d) by binned intervals of between-site reliability. The effect size value was derived from the single-sample t-test comparing recognition probes with scrambled pictures. The between-site reliability values were obtained from the eight-run between-site reliability maps (see Figure 3A). (B) Sample sizes required to detect a significant effect for a one-tailed test at three α-levels and three levels of power for medium to large effect sizes (Cohen’s d of 0.5 to 0.8).
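As a rough illustration of the kind of calculation behind panel B, the sketch below estimates the sample size for a one-tailed, single-sample test using the normal approximation n = ((z_(1-α) + z_power) / d)². The function name and the α and power values are assumptions for illustration; the caption does not state which levels were plotted, and an exact t-based calculation would give slightly larger n.

```python
from math import ceil
from statistics import NormalDist


def required_n(effect_size, alpha, power):
    """Approximate n for a one-tailed, single-sample test (normal approximation)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # one-tailed critical value
    z_power = std_normal.inv_cdf(power)      # quantile for the target power
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical alpha and power levels; d spans the medium-to-large range.
for d in (0.5, 0.8):
    print(d, required_n(d, alpha=0.05, power=0.80))
```

With these assumed settings the approximation gives 25 participants for d = 0.5 and 10 for d = 0.8, which shows how quickly the required sample grows as the effect size shrinks.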
Figure 6
Mean percent signal change for voxels with high between-site reliability and small effect sizes. Solid lines represent the performance of each subject at each of the four study sites.
Figure 7
Intraclass correlation (ICC) for the recognition probe versus scrambled faces contrast in selected regions of interest as a function of the number of runs averaged.
Figure 8
Between-site reliability for increasing numbers of runs averaged. Recognition probe following emotional distraction versus recognition probe following neutral distraction.
