Speeding up interval estimation for R2-based mediation effect of high-dimensional mediators via cross-fitting
- PMID: 39412139
- PMCID: PMC11823199
- DOI: 10.1093/biostatistics/kxae037
Abstract
Mediation analysis is a useful tool in investigating how molecular phenotypes such as gene expression mediate the effect of exposure on health outcomes. However, commonly used mean-based total mediation effect measures may suffer from cancellation of component-wise mediation effects in opposite directions in the presence of high-dimensional omics mediators. To overcome this limitation, we recently proposed a variance-based R-squared total mediation effect measure that relies on the computationally intensive nonparametric bootstrap for confidence interval estimation. In the work described herein, we formulated a more efficient two-stage, cross-fitted estimation procedure for the R2 measure. To avoid potential bias, we performed iterative Sure Independence Screening (iSIS) in two subsamples to exclude the non-mediators, followed by ordinary least squares regressions for the variance estimation. We then constructed confidence intervals based on the newly derived closed-form asymptotic distribution of the R2 measure. Extensive simulation studies demonstrated that this proposed procedure is much more computationally efficient than the resampling-based method, with comparable coverage probability. Furthermore, when applied to the Framingham Heart Study, the proposed method replicated the established finding of gene expression mediating age-related variation in systolic blood pressure and identified the role of gene expression profiles in the relationship between sex and high-density lipoprotein cholesterol level. The proposed estimation procedure is implemented in R package CFR2M.
Keywords: R2 total mediation effect measure; confidence interval; cross-fitting; gene expression; iterative sure independence screening; mediation analysis.
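The two-stage, cross-fitted idea in the abstract can be illustrated with a minimal sketch. It is not the CFR2M implementation, and it rests on several assumptions not stated above: the R2 measure is taken to be R2(Y~M) + R2(Y~X) - R2(Y~M,X), as in the earlier literature on R2-based mediation; a simple marginal-correlation ranking stands in for iSIS; and the function names (`r_squared`, `screen`, `r2_mediation`, `cross_fitted_r2`) and the cutoff `k` are hypothetical. The closed-form confidence interval is not reproduced, since its formula is not given here.

```python
import numpy as np

def r_squared(y, Z):
    """R^2 from an ordinary least squares fit of y on Z (intercept added)."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    resid = y - Z1 @ beta
    return 1.0 - resid.var() / y.var()

def screen(M, y, k):
    """Stand-in for iSIS (assumption): keep the k columns of M with the
    largest absolute marginal correlation with y."""
    Mc = M - M.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Mc.T @ yc) / (np.linalg.norm(Mc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[-k:]

def r2_mediation(x, M, y):
    """Assumed form of the measure: R2(Y~M) + R2(Y~X) - R2(Y~M,X)."""
    return (r_squared(y, M) + r_squared(y, x)
            - r_squared(y, np.column_stack([M, x])))

def cross_fitted_r2(x, M, y, k=10, seed=0):
    """Split the sample in two; screen mediators on one half, estimate the
    measure by OLS on the other half, swap roles, and average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    a, b = np.array_split(idx, 2)
    estimates = []
    for sel, est in ((a, b), (b, a)):
        keep = screen(M[sel], y[sel], k)
        estimates.append(r2_mediation(x[est], M[est][:, keep], y[est]))
    return float(np.mean(estimates))
```

Selecting mediators on one subsample and estimating variances on the other keeps the screening step from reusing the data that feed the OLS fits, which is the source of bias the abstract refers to.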
© The Author 2024. Published by Oxford University Press. All rights reserved. For Permissions, email: journals.permissions@oup.com.
Conflict of interest statement
None declared.
Update of
- Speeding up interval estimation for R2-based mediation effect of high-dimensional mediators via cross-fitting. bioRxiv [Preprint]. 2024 Sep 21:2023.02.06.527391. doi: 10.1101/2023.02.06.527391. Update in: Biostatistics. 2024 Dec 31;26(1):kxae037. doi: 10.1093/biostatistics/kxae037. PMID: 36798366.