Neuroimage. 2011 Feb 1;54(3):2163-75.

doi: 10.1016/j.neuroimage.2010.09.076. Epub 2010 Oct 13.

Multisite reliability of cognitive BOLD data

Gregory G Brown¹, Daniel H Mathalon, Hal Stern, Judith Ford, Bryon Mueller, Douglas N Greve, Gregory McCarthy, James Voyvodic, Gary Glover, Michele Diaz, Elizabeth Yetter, I Burak Ozyurt, Kasper W Jorgensen, Cynthia G Wible, Jessica A Turner, Wesley K Thompson, Steven G Potkin; Function Biomedical Informatics Research Network

PMID: 20932915
PMCID: PMC3009557
DOI: 10.1016/j.neuroimage.2010.09.076

Gregory G Brown et al. Neuroimage. 2011.

Affiliation

¹ VA San Diego Healthcare System and University of California San Diego, Department of Psychiatry, San Diego, CA 92161, USA. gbrown@ucsd.edu

Abstract

Investigators perform multi-site functional magnetic resonance imaging studies to increase statistical power, to enhance generalizability, and to improve the likelihood of sampling relevant subgroups. Yet undesired site variation in imaging methods could offset these potential advantages. We used variance components analysis to investigate sources of variation in the blood oxygen level-dependent (BOLD) signal across four 3-T magnets in voxelwise and region-of-interest (ROI) analyses. Eighteen participants traveled to four magnet sites to complete eight runs of a working memory task involving emotional or neutral distraction. Person variance was more than 10 times larger than site variance for five of six ROIs studied. Person-by-site interactions, however, contributed sizable unwanted variance to the total. Averaging over runs increased between-site reliability, with many voxels showing good to excellent between-site reliability when eight runs were averaged and regions of interest showing fair to good reliability. Between-site reliability depended on the specific functional contrast analyzed in addition to the number of runs averaged. Although median effect size was correlated with between-site reliability, dissociations were observed for many voxels. Brain regions where the pooled effect size was large but between-site reliability was poor were associated with reduced individual differences. Brain regions where the pooled effect size was small but between-site reliability was excellent were associated with a balance of participants who displayed consistently positive or consistently negative BOLD responses. Although between-site reliability of BOLD data can be good to excellent, acquiring highly reliable data requires robust activation paradigms, ongoing quality assurance, and careful experimental control.

Published by Elsevier Inc.
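
As a rough illustration of why averaging over runs can raise between-site reliability, the sketch below computes an absolute-agreement intraclass correlation from variance components. It assumes a simplified model with only person, site, person-by-site, and residual (run-level) variance; this is not necessarily the model the authors fitted, and the function name and all numeric values are hypothetical.

```python
def between_site_icc(var_person, var_site, var_person_site, var_residual, n_runs):
    """Absolute-agreement ICC for a score averaged over n_runs runs at one site.

    Simplified, assumed model: only the residual (run-level) variance
    shrinks when runs are averaged; site and person-by-site variance do not.
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    error = var_site + var_person_site + var_residual / n_runs
    return var_person / (var_person + error)


# Hypothetical variance components, chosen only for illustration.
components = dict(var_person=1.0, var_site=0.05,
                  var_person_site=0.4, var_residual=2.0)

for n in (1, 2, 4, 8):
    icc = between_site_icc(n_runs=n, **components)
    print(f"runs averaged = {n}: ICC = {icc:.3f}")
```

Under this assumed model the ICC rises with the number of runs but cannot exceed the ceiling set by the site and person-by-site terms, which is consistent with the abstract's remark that person-by-site interactions contribute unwanted variance.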

Figures

**Figure 1**
Working Memory Task (WM) with Emotional Distraction. Each blue square represents an acquired echo-planar volume. Every WM epoch is preceded and followed by passively viewed scrambled faces. During the encode period, eight line drawings of common objects are presented at a 2 s rate. During the maintain period, individuals are instructed to remember the eight learned pictures while deciding whether intervening pictures included a human face. The block of pictures presented during the maintain period was composed of either emotionally aversive or neutral pictures. During the recognition probe period, individuals decided which of two line drawing pictures was studied during the encode period. Decisions during the maintain and recognition periods were indicated by a button press. An orientation cross was presented 6 s before and after each WM-scrambled faces cycle.

**Figure 2**
Variance components for the recognition probe versus scrambled faces contrast.

**Figure 3**
Data from the recognition probe versus scrambled faces contrast. (A) Between-site reliability for increasing numbers of runs averaged. (B) Within-site, between-session reliability for eight runs averaged at the site where scans were repeated (Site D).

**Figure 4**
Percent of voxels in a particular between-site reliability bin significantly activated at three different levels of significance based on the pooled t-test of recognition versus scrambled contrast. Reliability intervals were based on the eight-run between-site reliability map.

**Figure 5**
(A) Effect size (Cohen’s d) by binned intervals of between-site reliability. The effect size value was derived from the single-sample t-test comparing recognition probes with scrambled pictures. The between-site reliability values were obtained from the eight-run between-site reliability maps. See Figure 3A. (B) Sample sizes required to detect a significant effect for a one-tailed test at three α-levels and three levels of power for medium to large effect sizes (Cohen’s d of 0.5 to 0.8).

**Figure 6**
Mean percent signal change for voxels with high between-site reliability and small effect sizes. Solid lines represent the performance of each subject at each of the four study sites.

**Figure 7**
Intraclass correlation (ICC) for the recognition probe versus scrambled faces contrast in selected regions of interest as a function of the number of runs averaged.

**Figure 8**
Between-site reliability for increasing numbers of runs averaged. Recognition probe following emotional distraction versus recognition probe following neutral distraction.
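
The kind of sample-size calculation summarized in Figure 5B can be sketched as follows. This is a minimal approximation, assuming a one-tailed, single-sample test and using the normal distribution in place of the t distribution, so it slightly underestimates the exact requirement. The function name is hypothetical, and the α and power values are illustrative choices, not the levels used in the figure.

```python
import math
from statistics import NormalDist


def approx_sample_size(effect_size, alpha, power):
    """Approximate n for a one-tailed, one-sample test (normal approximation).

    n is roughly ((z_{1-alpha} + z_{power}) / d) ** 2, rounded up.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


# Illustrative (assumed) alpha and power levels for medium and large effects.
for d in (0.5, 0.8):
    for alpha in (0.05, 0.01, 0.001):
        for power in (0.80, 0.90, 0.95):
            n = approx_sample_size(d, alpha, power)
            print(f"d={d}, alpha={alpha}, power={power}: n ~ {n}")
```

Because the required n scales with 1/d², moving from a medium to a large effect size cuts the approximate sample size by more than half at any fixed α and power.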

Grants and funding

U24 RR021992/RR/NCRR NIH HHS/United States
