Meta-Analysis
Proc Natl Acad Sci U S A. 2018 Feb 13;115(7):1481-1486.
doi: 10.1073/pnas.1719747115. Epub 2018 Jan 31.

Statistical tests and identifiability conditions for pooling and analyzing multisite datasets

Hao Henry Zhou et al. Proc Natl Acad Sci U S A. 2018.

Abstract

When sample sizes are small, the ability to identify weak (but scientifically interesting) associations between a set of predictors and a response may be enhanced by pooling existing datasets. However, variations in acquisition methods and the distribution of participants or observations between datasets, especially due to the distributional shifts in some predictors, may obfuscate real effects when datasets are combined. We present a rigorous statistical treatment of this problem and identify conditions where we can correct the distributional shift. We also provide an algorithm for the situation where the correction is identifiable. We analyze various properties of the framework for testing model fit, constructing confidence intervals, and evaluating consistency characteristics. Our technical development is motivated by Alzheimer's disease (AD) studies, and we present empirical results showing that our framework enables harmonizing of protein biomarkers, even when the assays across sites differ. Our contribution may, in part, mitigate a bottleneck that researchers face in clinical research when pooling smaller sized datasets and may offer benefits when the subjects of interest are difficult to recruit or when resources prohibit large single-site studies.

Keywords: causal model; maximum mean discrepancy; meta-analysis; multisite analysis; multisource.
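
The keywords list maximum mean discrepancy (MMD), a kernel-based measure of how far apart two sample distributions are. The abstract does not spell out the authors' algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a Gaussian kernel with a fixed bandwidth, synthetic one-dimensional data for two hypothetical sites (`site_a`, `site_b`), and a simple location-scale correction chosen purely for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic stand-ins for one measurement collected at two sites (assumption).
rng = np.random.default_rng(0)
site_a = rng.normal(0.0, 1.0, size=(200, 1))
site_b = rng.normal(1.5, 2.0, size=(200, 1))  # shifted and rescaled copy

# Hypothetical correction: match the mean and standard deviation of site_a.
site_b_corrected = (
    (site_b - site_b.mean(axis=0)) / site_b.std(axis=0) * site_a.std(axis=0)
    + site_a.mean(axis=0)
)

mmd_raw = mmd_squared(site_a, site_b)
mmd_corrected = mmd_squared(site_a, site_b_corrected)
print(f"MMD^2 before correction: {mmd_raw:.4f}")
print(f"MMD^2 after correction:  {mmd_corrected:.4f}")
```

A smaller value after the correction indicates that the transformed sample sits closer to the reference sample under this kernel. In practice the bandwidth and the family of allowed transformations would need to be chosen for the data at hand.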

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
A shows the distributional shift of Aβ1-42 across ADNI and W-ADRC. B shows the distributional shift of hippocampus volume across ADNI and W-ADRC.
Fig. 2.
A is an example of a graphical causal model. The colored nodes are an example of a d-separation rule, where I and J are d-separated by {O1, O2, O3}. B is the graphical causal model for our CSF data analysis example. Here, the population characteristics difference EP only has a direct causal effect on the age distribution. The sample selection bias EB is only directly related to diagnosis status D for each specific study. Nodes denoting age and sex influence the CSF measurements denoted by X, which then influence the diagnosis status D. The CSF measurements X and the nodes EP and EB are d-separated by diagnosis status D and age.
Fig. 3.
The plots of (A) Aβ1-42 and (B) p-tau/Aβ1-42 show the empirical distributions of W-ADRC samples (blue), ADNI samples (red), and transformed ADNI samples (brown). W-ADRC samples are nicely matched with transformed ADNI samples.
Fig. 4.
A shows the trend of the mean squared prediction error (MSPE) for hippocampus volume as the sample size increases, using 400 bootstraps. The bar plot covers the prediction error for three types of training set, as depicted in the legend: W-ADRC only (red), W-ADRC plus ADNI (green), and W-ADRC plus transformed ADNI (blue). The third model consistently performs the best. B shows the trend of classification accuracy for patients with AD (solid lines) and healthy patients (dotted lines) as the sample size increases, using 400 bootstraps. An SVM model is used, and the three types of training set are shown in the legend. For samples with AD, the three methods converge to the same accuracy as the training sample size increases. For healthy CNs, the W-ADRC plus transformed ADNI dataset is always better than the other two schemes. It is interesting to see that W-ADRC plus the raw ADNI data also performs better than W-ADRC alone, possibly because only 25 (24%) subjects from W-ADRC are diagnosed with AD; with so few AD samples, even the uncorrected ADNI data inform the classification model.
