Comparative Study
Stat Appl Genet Mol Biol. 2014 Feb;13(1):35-48.
doi: 10.1515/sagmb-2012-0040.

Multiple comparisons in genetic association studies: a hierarchical modeling approach

Nengjun Yi et al. Stat Appl Genet Mol Biol. 2014 Feb.

Abstract

Multiple comparisons, or multiple testing, has been viewed as a thorny issue in genetic association studies that aim to detect disease-associated genetic variants among a large number of genotyped variants. We alleviate the problem of multiple comparisons by proposing a hierarchical modeling approach that is fundamentally different from existing methods. The proposed hierarchical models simultaneously fit as many variables as possible and shrink unimportant effects towards zero. Thus, the hierarchical models yield more efficient parameter estimates than traditional methods that analyze genetic variants separately, and they coherently address the multiple comparisons problem by greatly reducing both the effective number of genetic effects and the number of statistically "significant" effects. We develop a method for computing the effective number of genetic effects in hierarchical generalized linear models, and propose a new adjustment for multiple comparisons, the hierarchical Bonferroni correction, based on the effective number of genetic effects. Our approach not only increases the power to detect disease-associated variants but also controls the Type I error. We illustrate and evaluate our method with real and simulated data sets from genetic association studies. The method has been implemented in our freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/).
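
The idea of dividing the significance level by an effective number of effects, rather than by the raw number of variants, can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so the code below makes a loud assumption: it uses the standard effective degrees of freedom of a ridge (normal-prior) linear fit as a stand-in for the "effective number of genetic effects". The function names `effective_number` and `bonferroni_threshold`, the penalty `lam`, and the simulated genotype matrix are all hypothetical and are not part of BhGLM.

```python
import numpy as np


def effective_number(X, lam):
    """Effective degrees of freedom of a ridge fit with penalty lam.

    ASSUMPTION: a stand-in for the paper's effective number of genetic
    effects. Equals trace(X (X'X + lam I)^-1 X') = sum d_i^2 / (d_i^2 + lam),
    where d_i are the singular values of X.
    """
    d = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(d**2 / (d**2 + lam)))


def bonferroni_threshold(alpha, n_effects):
    """Per-test significance threshold alpha / n_effects.

    n_effects is clamped at 1 so the threshold never exceeds alpha.
    """
    return alpha / max(n_effects, 1.0)


# Simulated stand-in for a genotype matrix: 200 individuals, 50 variants.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
X = X - X.mean(axis=0)  # center each column

m_eff = effective_number(X, lam=100.0)

standard = bonferroni_threshold(0.05, X.shape[1])  # divides by 50
hierarchical = bonferroni_threshold(0.05, m_eff)   # divides by m_eff < 50

print(f"effective number of effects: {m_eff:.2f}")
print(f"standard threshold:          {standard:.5f}")
print(f"effective-number threshold:  {hierarchical:.5f}")
```

Under this assumption, stronger shrinkage (a larger `lam`) lowers the effective number and so relaxes the per-test threshold, while `lam = 0` recovers the ordinary Bonferroni divisor. In the paper's setting the adjusted threshold would be applied to p-values from the jointly fitted hierarchical model, not to single-variant tests.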

Figures

Figure 1
Dallas Heart Study sequencing data. Left panel: the traditional single-SNP method, which analyzes each variant separately. Right panel: the proposed hierarchical normal linear model, which simultaneously fits the main effects of all 339 variants. All analyses include race, age, and gender as covariates in the model (not shown). The points, short lines, and numbers at the right side represent effect estimates, ± 2 standard errors, and original p-values, respectively. Only effects with p-values below 0.05 are labeled and drawn in black.
Figure 2
Adiponectin genes and colorectal cancer risk. Left panel: the traditional method, which analyzes two variants at a time. Right panel: the proposed hierarchical logistic regression, which simultaneously fits all the main effects and the epistatic interactions. All analyses include age and gender as covariates in the model (not shown). The points, short lines, and numbers at the right side represent effect estimates, ± 2 standard errors, and original p-values, respectively. Only effects with p-values below 0.05 are labeled and drawn in black.
Figure 3
Frequency, over 1000 replicates, with which each effect had an original or adjusted p-value smaller than 0.05. Left panel: the traditional method, which analyzes two variants at a time. Right panel: the proposed hierarchical logistic regression, which simultaneously fits all the main effects and the epistatic interactions. All analyses include age and gender as covariates in the model (not shown). The points (●) represent frequencies estimated with original p-values. The squares (■) represent frequencies estimated with the minimum p-values adjusted by the six previous methods. The circles (○) represent frequencies estimated with the p-values of the hierarchical Bonferroni correction. Only effects with non-zero simulated values are labeled, in red.
