Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data
- PMID: 22373389
- PMCID: PMC3287883
- DOI: 10.1186/1753-6561-5-S9-S46
Abstract
We apply Efron's empirical Bayes classification method to risk prediction in a genome-wide association study using the Genetic Analysis Workshop 17 (GAW17) data. A major advantage of this method is that the effect size distribution for the set of possible features is estimated empirically, and all subsequent parameter estimation and risk prediction are guided by this distribution. Here, we generalize Efron's method to accommodate some of the peculiarities of the GAW17 data. In particular, we introduce two ways to extend Efron's model: a weighted empirical Bayes model and a joint covariance model that allows the model to properly incorporate the annotation information of single-nucleotide polymorphisms (SNPs). In the course of our analysis, we examine several aspects of the possible simulation model, including the identity of the most important genes, the differing effects of synonymous and nonsynonymous SNPs, and the relative roles of covariates and genes in conferring disease risk. Finally, we compare the three empirical Bayes methods to each other and to other classifiers (random forest and neural network).
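The general idea of letting an empirically estimated effect size distribution guide prediction can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a normal prior on standardized effect sizes, which is a simplification that the abstract does not state, and the function names `shrink_effects` and `risk_scores` are hypothetical. Each feature's z-score is shrunk toward zero by a factor estimated from all features jointly, and the shrunken effects are then summed into a per-subject risk score.

```python
import numpy as np


def shrink_effects(z):
    """Posterior-mean effects, assuming z_i | d_i ~ N(d_i, 1) and d_i ~ N(0, A).

    The normal prior is an illustrative assumption, not taken from the paper.
    """
    z = np.asarray(z, dtype=float)
    # Marginally z_i ~ N(0, 1 + A), so estimate A by moments and floor it at 0.
    a_hat = max(float(np.mean(z ** 2)) - 1.0, 0.0)
    # Posterior mean under this model is z * A / (1 + A).
    return z * a_hat / (1.0 + a_hat)


def risk_scores(X, y):
    """Score each row of X using shrunken per-feature case/control effects."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)  # True marks a case, False a control
    n1, n0 = int(y.sum()), int((~y).sum())
    mean1, mean0 = X[y].mean(axis=0), X[~y].mean(axis=0)
    pooled_var = ((n1 - 1) * X[y].var(axis=0, ddof=1)
                  + (n0 - 1) * X[~y].var(axis=0, ddof=1)) / (n1 + n0 - 2)
    sd = np.sqrt(pooled_var)
    sd = np.where(sd > 0, sd, 1.0)  # guard against monomorphic features
    c = np.sqrt(n1 * n0 / (n1 + n0))
    z = c * (mean1 - mean0) / sd  # two-sample z-score for each feature
    delta_hat = shrink_effects(z) / c  # shrunken effect in SD units
    W = (X - (mean1 + mean0) / 2.0) / sd  # features centered at the class midpoint
    return W @ delta_hat  # larger score suggests higher predicted risk
```

Under this sketch a subject would be classified as a case when the score exceeds a chosen threshold. A fuller treatment would estimate the effect size distribution without assuming normality.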