PLoS Genet. 2008 Jul 25;4(7):e1000130.

doi: 10.1371/journal.pgen.1000130.

Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies

Clive J Hoggart¹, John C Whittaker, Maria De Iorio, David J Balding

PMID: 18654633
PMCID: PMC2464715
DOI: 10.1371/journal.pgen.1000130

Clive J Hoggart et al. PLoS Genet. 2008.

Affiliation

¹ Department of Epidemiology and Public Health, Imperial College, London, United Kingdom. c.hoggart@ic.ac.uk

Abstract

Testing one SNP at a time does not fully realise the potential of genome-wide association studies to identify multiple causal variants, which is a plausible scenario for many complex diseases. We show that simultaneous analysis of the entire set of SNPs from a genome-wide study to identify the subset that best predicts disease outcome is now feasible, thanks to developments in stochastic search methods. We used a Bayesian-inspired penalised maximum likelihood approach in which every SNP can be considered for additive, dominant, and recessive contributions to disease risk. Posterior mode estimates were obtained for regression coefficients that were each assigned a prior with a sharp mode at zero. A non-zero coefficient estimate was interpreted as corresponding to a significant SNP. We investigated two prior distributions and show that the normal-exponential-gamma prior leads to improved SNP selection in comparison with single-SNP tests. We also derived an explicit approximation for type-I error that avoids the need to use permutation procedures. As well as genome-wide analyses, our method is well-suited to fine mapping with very dense SNP sets obtained from re-sequencing and/or imputation. It can accommodate quantitative as well as case-control phenotypes and covariate adjustment, and it can be extended to search for interactions. Here, we demonstrate the power and empirical type-I error of our approach using simulated case-control data sets of up to 500 K SNPs, a real genome-wide data set of 300 K SNPs, and a sequence-based data set, each of which can be analysed in a few hours on a desktop workstation.
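The idea of a posterior mode under a prior with a sharp mode at zero can be illustrated with a small sketch. It assumes the double-exponential (Laplace) prior, for which the posterior mode coincides with an L1-penalised logistic regression fit, and it uses plain proximal gradient descent as a stand-in optimiser; this is not the authors' algorithm, and the NEG prior is not implemented here. The function names, the simulated data, the additive-only genotype coding, and the penalty value are all hypothetical choices made for illustration.

```python
import math
import random

def soft_threshold(z, t):
    """Proximal operator of t*|z|: shrinks z towards zero, exactly zero if |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def fit_l1_logistic(X, y, penalty, n_iter=300):
    """Posterior mode under a Laplace prior, i.e. L1-penalised mean logistic loss.

    The intercept is not penalised. Assumed optimiser: proximal gradient descent.
    """
    n, p = len(X), len(X[0])
    # Step size 1/L, with L bounded by the trace of the intercept-augmented Gram matrix.
    lipschitz = 0.25 * sum(1.0 + sum(v * v for v in row) for row in X) / n
    step = 1.0 / lipschitz
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            resid = sigmoid(intercept + sum(b * v for b, v in zip(beta, row))) - yi
            grad0 += resid
            for j, v in enumerate(row):
                grad[j] += resid * v
        intercept -= step * grad0 / n
        beta = [soft_threshold(b - step * g / n, step * penalty)
                for b, g in zip(beta, grad)]
    return intercept, beta

def simulate(n=400, p=10, maf=0.3, effect=1.5, seed=1):
    """Hypothetical case-control data: only SNP 0 is causal (additive, log-odds scale)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        # Genotype = minor-allele count in {0, 1, 2}.
        row = [float((rng.random() < maf) + (rng.random() < maf)) for _ in range(p)]
        X.append(row)
        y.append(1.0 if rng.random() < sigmoid(-1.0 + effect * row[0]) else 0.0)
    return X, y

X, y = simulate()
_, beta = fit_l1_logistic(X, y, penalty=0.05)
# A SNP counts as selected when its coefficient estimate is non-zero.
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected SNP indices:", selected)
```

Because the soft-thresholding step sets small coefficients exactly to zero, selection falls out of the fit itself rather than from a separate per-SNP test. A larger penalty corresponds to a prior concentrated more tightly around zero and therefore selects fewer SNPs.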

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Figure 1. Logarithms of NEG and DE densities.**
Fixed to have the same density at the origin.
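A comparison of this kind can be reproduced numerically with a short sketch. It assumes one common parametrisation of the NEG prior: a double-exponential density whose squared rate (divided by two) follows a gamma distribution with shape `shape` and rate `scale**2`. The paper's exact parametrisation and parameter values may differ, and the values used below are hypothetical. The DE rate is chosen so that both densities agree at the origin.

```python
import math

def de_log_density(beta, xi):
    """Log density of the double-exponential (Laplace) prior with rate xi."""
    return math.log(xi / 2.0) - xi * abs(beta)

def neg_log_density(beta, shape, scale, n_steps=4000):
    """Log density of the assumed NEG prior, by numerical integration.

    Assumed hierarchy: beta | u ~ Laplace(rate sqrt(2u)), u ~ Gamma(shape, rate scale^2).
    With the substitution u = s^2 the marginal density becomes
        sqrt(2) * scale^(2*shape) / Gamma(shape)
          * integral_0^inf s^(2*shape) * exp(-sqrt(2)*s*|beta| - (scale*s)^2) ds.
    """
    upper = 8.0 / scale              # the integrand is negligible beyond this point
    h = upper / n_steps
    const = math.sqrt(2.0) * scale ** (2.0 * shape) / math.gamma(shape)
    total = 0.0
    # Trapezoid rule; the integrand is zero at s = 0 and negligible at s = upper.
    for k in range(1, n_steps):
        s = k * h
        total += s ** (2.0 * shape) * math.exp(
            -math.sqrt(2.0) * s * abs(beta) - (scale * s) ** 2)
    return math.log(const * h * total)

shape, scale = 1.5, 1.0              # hypothetical NEG parameters
# DE rate that matches the NEG density at beta = 0 under the assumed hierarchy.
xi = math.sqrt(2.0) * math.gamma(shape + 0.5) / (scale * math.gamma(shape))

for b in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(f"beta={b:4.1f}  log NEG={neg_log_density(b, shape, scale):8.3f}"
          f"  log DE={de_log_density(b, xi):8.3f}")
```

Under these assumptions the DE log density falls linearly in |beta|, while the NEG log density flattens out, so the NEG places more prior mass on large effects for the same density at zero.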

**Figure 2. Main simulation study.**
Histograms of the number of selected SNPs tagging (at r² > 0.05) each causal SNP for (A) NEG and (B) ATT analyses.
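Counting how many selected SNPs tag a causal SNP can be sketched as follows. The sketch assumes that r² is approximated by the squared Pearson correlation between minor-allele counts, which is a common genotype-based proxy and not necessarily the estimator used in the paper. The function names and the toy genotypes are hypothetical.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two vectors of allele counts (0, 1, 2)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a monomorphic SNP carries no information and tags nothing
    return cov * cov / (var_a * var_b)

def count_tagging(selected, causal, threshold=0.05):
    """Number of selected SNPs whose r2 with the causal SNP exceeds the threshold."""
    return sum(1 for snp in selected if genotype_r2(snp, causal) > threshold)

causal = [0, 1, 2, 0, 1, 2, 0, 1]
selected = [
    list(causal),             # identical to the causal SNP
    [2 - g for g in causal],  # alleles coded the other way round, still a perfect tag
    [1] * len(causal),        # monomorphic, tags nothing
]
print(count_tagging(selected, causal))
```

The squared correlation is invariant to which allele is counted, so a SNP with reversed allele coding is still treated as a tag.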

**Figure 3. GWA simulation.**
(A) Locations of the ten causal variants (vertical blue lines) on the 20 Mb chromosome; also shown are the SNPs selected by NEG (red dots) and the SNPs with ATT p-value < 5×10⁻⁷ (black dots), plotted against −log₁₀(p-value). (B) and (C) show zooms of two sub-intervals of (A).

**Figure 4. Re-sequencing simulation.**
Histograms of the maximum r² for each selected SNP with a causal variant for (A) NEG and (B) ATT analyses.

Grants and funding

G0300766/MRC_/Medical Research Council/United Kingdom
