Meta-Analysis
Am J Epidemiol. 2009 Nov 15;170(10):1197-206.
doi: 10.1093/aje/kwp262. Epub 2009 Oct 6.

Discovery properties of genome-wide association signals from cumulatively combined data sets


Tiago V Pereira et al. Am J Epidemiol.

Abstract

Genetic effects for common variants affecting complex disease risk are subtle. Single genome-wide association (GWA) studies are typically underpowered to detect these effects, and combination of several GWA data sets is needed to enhance discovery. The authors investigated the properties of the discovery process in simulated cumulative meta-analyses of GWA study-derived signals allowing for potential genetic model misspecification and between-study heterogeneity. Variants with null effects on average (but also between-data set heterogeneity) could yield false-positive associations with seemingly homogeneous effects. Random effects had higher than appropriate false-positive rates when there were few data sets. The log-additive model had the lowest false-positive rate. Under heterogeneity, random-effects meta-analyses of 2-10 data sets averaging 1,000 cases/1,000 controls each did not increase power, or the meta-analysis was even less powerful than a single study (power desert). Upward bias in effect estimates and underestimation of between-study heterogeneity were common. Fixed-effects calculations avoided power deserts and maximized discovery of association signals at the expense of much higher false-positive rates. Therefore, random- and fixed-effects models are preferable for different purposes (fixed effects for initial screenings, random effects for generalizability applications). These results may have broader implications for the design and interpretation of large-scale multiteam collaborative studies discovering common gene variants.
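The contrast between fixed- and random-effects pooling described in the abstract can be illustrated with a minimal sketch. It assumes inverse-variance weighting and the DerSimonian-Laird moment estimator for the between-study variance; the abstract does not say which estimator the authors used, so that choice, along with all function names and the example numbers, is hypothetical.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects summary of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    return beta, math.sqrt(1.0 / w_sum)


def dersimonian_laird_tau2(betas, ses):
    """Moment estimate of between-study variance (assumed estimator)."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta_fe = sum(w * b for w, b in zip(weights, betas)) / w_sum
    # Cochran's Q: weighted squared deviations from the fixed-effects summary.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    if c <= 0:
        return 0.0  # a single data set carries no heterogeneity information
    return max(0.0, (q - (len(betas) - 1)) / c)


def random_effects(betas, ses):
    """Random-effects summary: tau^2 is added to each study's variance."""
    tau2 = dersimonian_laird_tau2(betas, ses)
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    return beta, math.sqrt(1.0 / w_sum), tau2


def two_sided_p(beta, se):
    """Two-sided P value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


# Hypothetical log odds ratios from four heterogeneous data sets.
betas = [0.5, 0.1, 0.3, -0.1]
ses = [0.1, 0.1, 0.1, 0.1]

beta_fe, se_fe = fixed_effects(betas, ses)
beta_re, se_re, tau2 = random_effects(betas, ses)
p_fe = two_sided_p(beta_fe, se_fe)
p_re = two_sided_p(beta_re, se_re)
```

In this made-up example both models give the same summary estimate, but the random-effects standard error is larger because the estimated between-study variance is added to every study's variance. That is consistent with the abstract's observation that random effects can lose power under heterogeneity while fixed effects yield more significant signals.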


Figures

Figure 1.
Cumulative power comparison among 3 main true genetic models by random-effects calculations (dominant in A, B, and C; log additive in D, E, and F; and recessive in G, H, and I) in meta-analyses of up to 30 data sets (with an average of 2,000 participants, a range of 1,000–3,000 in each, and a case-control ratio of 1.0), combining data from common gene variants (minor allele frequency, f = 0.4) with modest effect sizes (odds ratio = 1.3). In each panel, power is given under the correct and under misspecified genetic models. Models of analysis: dominant = squares; log additive (per allele risk) = triangles; and recessive = circles. Power is calculated by the proportion of simulated meta-analyses that exceed the threshold of P < 10⁻⁷. The region of power desert is illustrated by open symbols in panels C, E, F, and I. τ², between-study variance.
Figure 2.
Median bias in summary effect sizes (logarithm of the odds ratio) for different true underlying modes of inheritance (dominant in A, B, and C; log additive in D, E, and F; and recessive in G, H, and I) and misspecified genetic models under random-effects calculations. Models of analysis: dominant = squares; log additive (per allele risk) = triangles; and recessive = circles. Bias was calculated as the median ratio of the detected effect size and the true effect size computed from the set of meta-analyses showing statistically significant signals at P < 10⁻⁷. τ², between-study variance.
Figure 3.
Cumulative power comparison among 3 main true genetic models by fixed-effects calculations (dominant in A, B, and C; log additive in D, E, and F; and recessive in G, H, and I) in meta-analyses of up to 30 data sets (with an average of 2,000 participants, a range of 1,000–3,000 in each, and a case-control ratio of 1.0), combining data from common gene variants (minor allele frequency, f = 0.4) with modest effect sizes (odds ratio = 1.3). In each panel, power is given under the correct and under misspecified genetic models. Models of analysis: dominant = squares; log additive (per allele risk) = triangles; and recessive = circles. Power is calculated by the proportion of simulated meta-analyses that exceed the threshold of P < 10⁻⁷. τ², between-study variance.
Figure 4.
Bias in between-study variance (τ²) estimates for the set of meta-analyses that showed statistically significant results (P < 10⁻⁷) under different true underlying genetic models (dominant in A; log additive in B; and recessive in C) with heterogeneity (τ² = 0.05). Models of analysis: dominant = squares; log additive (per allele risk) = triangles; and recessive = circles.
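
The figure captions define power as the proportion of simulated meta-analyses reaching the significance threshold, and bias as the median ratio of detected to true effect size among the significant ones. A minimal sketch of those two summaries follows; the function names and the example inputs are hypothetical and are not taken from the paper.

```python
from statistics import median


def empirical_power(p_values, alpha=1e-7):
    """Proportion of simulated meta-analyses with P below the threshold."""
    return sum(p < alpha for p in p_values) / len(p_values)


def median_bias_ratio(estimates, p_values, true_effect, alpha=1e-7):
    """Median of detected / true effect size among significant results.

    Returns None when no simulated meta-analysis is significant.
    """
    significant = [b for b, p in zip(estimates, p_values) if p < alpha]
    if not significant:
        return None
    return median(b / true_effect for b in significant)


# Hypothetical simulation output: summary log odds ratios and P values.
estimates = [0.40, 0.20, 0.30, 0.10]
p_values = [1e-9, 0.01, 1e-8, 0.5]

power = empirical_power(p_values)
bias = median_bias_ratio(estimates, p_values, true_effect=0.25)
```

Because the bias ratio is computed only over results that pass the threshold, a value above 1 reflects the upward bias in effect estimates that the abstract reports.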
