Comment
Nature. 2023 Mar;615(7951):E8-E12. doi: 10.1038/s41586-023-05746-w.

Reply to: Multivariate BWAS can be replicable with moderate sample sizes

Brenden Tervo-Clemmens et al. Nature. 2023 Mar.
No abstract available


Conflict of interest statement

D.A.F. and N.U.F.D. have a financial interest in Turing Medical and may financially benefit if the company is successful in marketing FIRMM motion monitoring software products. A.N.V., D.A.F. and N.U.F.D. may receive royalty income based on FIRMM technology developed at Washington University School of Medicine and Oregon Health and Sciences University and licensed to Turing Medical. D.A.F. and N.U.F.D. are co-founders of Turing Medical. These potential conflicts of interest have been reviewed and are managed by Washington University School of Medicine, Oregon Health and Sciences University and the University of Minnesota.

Figures

Fig. 1. In-sample versus out-of-sample effect estimates in multivariate BWAS.
a–e, Methods comparison between our previous study (split-half) and Spisak et al. (cross-validation followed by split-half). ‘Marek, Tervo-Clemmens’ and ‘Spisak’ refer to the methodologies described in ref. and ref., respectively. For a–e, HCP 1200 Release (full correlation) data were used to predict age-adjusted total cognitive ability. Analysis code and visualizations (x, y scaling; colours) are the same as in Spisak et al. The x axes in a–e always display the split-half out-of-sample effect estimates from the second (replication) half of the data (correlation between true scores and predicted scores; as in Spisak et al. and in our previous study; Supplementary Methods). a, In-sample association (training correlation; y axis) as a function of out-of-sample association (plot convention as in our previous study). b, Matched comparison of the true in-sample association (training correlations, mean across folds; y axis) in the method proposed by Spisak et al. c, The correction proposed by Spisak et al., which inserts an additional cross-validation step to evaluate the first half of the data and by definition makes this an out-of-sample association (y axis). d, Replacing the cross-validation step from Spisak et al. with a split-half validation provides a different (compared with c) out-of-sample association of the first half of the total data (that is, each of the first-stage split halves is one-quarter of the total data; y axis). The appropriate and direct comparison of in-sample associations between Spisak et al. and our previous study is therefore b versus a, rather than c versus a. The Spisak et al. method (cross-validation followed by split-half validation) does not reduce in-sample overfitting (b) but instead adds an additional out-of-sample evaluation (c), which is nearly identical to performing split-half validation twice in a row (d); this makes it clear why the out-of-sample performance of the two methods is likewise nearly identical. e, Correspondence between out-of-sample associations (to the left-out half) from the additional cross-validation step proposed by Spisak et al. (mean across folds; y axis) and the original split-half validation from our previous study (x axis). The identity line is shown in black. f, In-sample (r; light blue) and out-of-sample (r; dark blue) associations as a function of sample size. Data are from figure 4a–d of ref. g, Published literature review of multivariate r (y axis) as a function of sample size (data from ref. ), displayed with permission. For f and g, best-fit lines are displayed in log10 space. h, Overlap of f and g.
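
To make the in-sample versus out-of-sample contrast in a–c concrete, the following minimal Python sketch (scikit-learn and SciPy) reproduces the logic on synthetic data; the feature matrix, phenotype, sample size and ridge penalty are illustrative assumptions, not the authors' HCP pipeline.

import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n, p = 400, 1000                                   # hypothetical sample and feature counts
X = rng.standard_normal((n, p))                    # stand-in for RSFC features
y = X @ (rng.standard_normal(p) * 0.02) + rng.standard_normal(n)  # stand-in phenotype

# Split-half: fit on the first (discovery) half, evaluate on the second (replication) half.
half = n // 2
X_tr, y_tr, X_te, y_te = X[:half], y[:half], X[half:], y[half:]

model = Ridge(alpha=1.0).fit(X_tr, y_tr)
r_in = pearsonr(y_tr, model.predict(X_tr))[0]      # in-sample (training) correlation, as in a and b
r_out = pearsonr(y_te, model.predict(X_te))[0]     # out-of-sample correlation in the replication half (x axes)

# Panel c-style step: score the first half itself out of sample via internal cross-validation.
y_cv = cross_val_predict(Ridge(alpha=1.0), X_tr, y_tr, cv=5)
r_cv = pearsonr(y_tr, y_cv)[0]

print(f"in-sample r = {r_in:.2f}, out-of-sample r = {r_out:.2f}, internal-CV r = {r_cv:.2f}")
# With far more features than participants, r_in is strongly inflated relative to r_out,
# whereas r_cv tracks r_out: the added cross-validation step is another out-of-sample
# evaluation rather than a correction of in-sample overfitting.
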
Fig. 2. BWAS reproducibility, scope and prediction accuracy using the method of Spisak et al.
a, Example bootstrapped BWAS of total cognitive ability (green) and null distribution (black) (y axis), as a function of sample size (x axis) from the suggested method of Spisak et al. (RSFC by partial correlation; prediction by ridge regression) in the HCP dataset (n = 1,200, 1 site, 1 scanner, 60 min RSFC/participant, 76% white). Sample sizes were log10-transformed for visualization. b, Out-of-sample correlation (between true scores and predicted scores) from ridge regression (y axis; code from Spisak et al.) as a function of training sample size (x axis, log10 scaling) for 33 cognitive and mental health phenotypes (Supplementary Information) in the HCP dataset. Each line displays a smoothed fit estimate (through penalized splines in general additive models) for a brain (RSFC (partial correlations, as proposed by Spisak et al.), cortical thickness) phenotype pair (66 total) that has 100 bootstrapped iterations from sample sizes of 25 to 500 (inclusive) in increments of 25 (20 total bins). Sample sizes were log10-transformed (for visualization) before general additive model fitting. c, The same as in b, but in the ABCD dataset (n = 11,874, 21 sites, 3 scanner manufacturers, 20 min RSFC/participant, 56% white) using 32 cognitive and mental health phenotypes at sample sizes of 25, 50, 75 and from 100 to 1,900 (inclusive) in increments of 100 (22 total bins). d, The percentage of brain–phenotype pairs (BWAS) from b and c with significant replication on the basis of the method of Spisak et al. (Supplementary Information). e, Comparison of our original method in our previous study and the method proposed by Spisak et al. at the full split-half sample size of HCP (left) and ABCD (right). Out-of-sample correlations (RSFC with total cognitive ability, y axis) for the method used in our previous study (dark green; RSFC by correlation, PCA, SVR) and by Spisak et al. (light green; RSFC by partial correlation, ridge regression). Repeating the method proposed by Spisak et al. in ABCD (right) and comparing this to the method used in our previous study results in a very similar out-of-sample r. f, Simulated individual studies (light green circles; n = 1,000 per sample size) and meta-analytic estimates (black dot, ±1 s.d.) using the method of Spisak et al. (partial correlations in the HCP dataset) for the largest univariate association (left; y axis, bivariate correlation) and multivariate association (right; y axis, out-of-sample correlation) for total cognitive ability versus RSFC, as a function of total sample size (x axis; bivariate correlation for sample sizes of 50, 200 and 1,000, and multivariate sum of train and test samples, each 25, 100 and 500). For univariate approaches, studies of any sample size, when appropriately aggregated to a large total sample size, can correctly estimate the true effect size. However, for multivariate approaches, even when aggregating across 1,000 independent studies, studies with a small sample size produce prediction accuracies that are downwardly biased relative to large sample studies, highlighting the need for large samples in multivariate analyses.
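
The sample-size dependence in b, c and f can be sketched the same way: draw bootstrapped training subsamples of increasing size, fit ridge regression, and score each fit on a fixed held-out set. This is a hedged illustration on synthetic data with assumed sizes, not the HCP/ABCD analysis itself.

import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n, p = 2000, 500
X = rng.standard_normal((n, p))                    # stand-in for brain features
y = X @ (rng.standard_normal(p) * 0.03) + rng.standard_normal(n)  # stand-in phenotype

# Fixed held-out test set; training subsamples are drawn from the remaining pool.
X_test, y_test = X[-500:], y[-500:]
X_pool, y_pool = X[:-500], y[:-500]

for n_train in (25, 50, 100, 200, 400, 800):       # illustrative sample-size bins
    rs = []
    for _ in range(100):                           # bootstrapped iterations per bin
        idx = rng.choice(len(y_pool), size=n_train, replace=False)
        fit = Ridge(alpha=1.0).fit(X_pool[idx], y_pool[idx])
        rs.append(pearsonr(y_test, fit.predict(X_test))[0])
    print(f"n_train={n_train:4d}  out-of-sample r: mean={np.mean(rs):.2f}, sd={np.std(rs):.2f}")
# Small training samples give out-of-sample correlations that are both noisier and biased
# downward relative to large samples, so averaging many small-sample estimates (as in f)
# does not recover the large-sample prediction accuracy.
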

Comment on

  • Reproducible brain-wide association studies require thousands of individuals.
    Marek S, Tervo-Clemmens B, Calabro FJ, Montez DF, Kay BP, Hatoum AS, Donohue MR, Foran W, Miller RL, Hendrickson TJ, Malone SM, Kandala S, Feczko E, Miranda-Dominguez O, Graham AM, Earl EA, Perrone AJ, Cordova M, Doyle O, Moore LA, Conan GM, Uriarte J, Snider K, Lynch BJ, Wilgenbusch JC, Pengo T, Tam A, Chen J, Newbold DJ, Zheng A, Seider NA, Van AN, Metoki A, Chauvin RJ, Laumann TO, Greene DJ, Petersen SE, Garavan H, Thompson WK, Nichols TE, Yeo BTT, Barch DM, Luna B, Fair DA, Dosenbach NUF. Nature. 2022 Mar;603(7902):654-660. doi: 10.1038/s41586-022-04492-9. Epub 2022 Mar 16. PMID: 35296861.
  • Multivariate BWAS can be replicable with moderate sample sizes.
    Spisak T, Bingel U, Wager TD. Nature. 2023 Mar;615(7951):E4-E7. doi: 10.1038/s41586-023-05745-x. Epub 2023 Mar 8. PMID: 36890392.

References

    1. Marek S, et al. Reproducible brain-wide association studies require thousands of individuals. Nature. 2022;603:654–660. doi: 10.1038/s41586-022-04492-9.
    2. Schönbrodt FD, Perugini M. At what sample size do correlations stabilize? J. Res. Pers. 2013;47:609–612.
    3. Button KS, et al. Confidence and precision increase with high statistical power. Nat. Rev. Neurosci. 2013;14:585–586. doi: 10.1038/nrn3475-c4.
    4. Varoquaux G. Cross-validation failure: small sample sizes lead to large error bars. Neuroimage. 2018;180:68–77. doi: 10.1016/j.neuroimage.2017.06.061.
    5. Traut N, et al. Insights from an autism imaging biomarker challenge: promises and threats to biomarker discovery. Neuroimage. 2022;255:119171. doi: 10.1016/j.neuroimage.2022.119171.
