Genomic inflation factors under polygenic inheritance

PMID: 21407268
PMCID: PMC3137506
DOI: 10.1038/ejhg.2011.39

Jian Yang et al. Eur J Hum Genet. 2011 Jul.

Eur J Hum Genet. 2011 Jul;19(7):807-12.

doi: 10.1038/ejhg.2011.39. Epub 2011 Mar 16.

Affiliation

¹ Queensland Statistical Genetics Laboratory, Queensland Institute of Medical Research, Brisbane, Queensland, Australia. jian.yang@qimr.edu.au

Abstract

Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide polymorphisms (SNPs) across the genome is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ~4000 and ~133,000 individuals.
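As an illustration of the 'scaled median or mean test statistic' mentioned in the abstract, the sketch below computes both inflation factors from a vector of 1-df χ² association statistics and applies a genomic-control adjustment. It is a minimal sketch, not the authors' code: the function names are hypothetical, the scaling constant is the median of a χ² distribution with 1 degree of freedom (about 0.4549; its mean is 1), and leaving statistics unadjusted when the factor is below 1 is an assumed convention.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom (its mean is 1).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return (lambda_mean, lambda_median) for 1-df chi-square statistics."""
    chi2_stats = np.asarray(chi2_stats, dtype=float)
    lambda_mean = chi2_stats.mean() / 1.0  # expected mean under the null is 1
    lambda_median = np.median(chi2_stats) / CHI2_1DF_MEDIAN
    return lambda_mean, lambda_median


def genomic_control(chi2_stats, lam):
    """Divide each statistic by the inflation factor.

    Assumption: no adjustment is made when lam < 1.
    """
    return np.asarray(chi2_stats, dtype=float) / max(lam, 1.0)


# Null statistics should give inflation factors close to 1.
rng = np.random.default_rng(0)
null_stats = rng.chisquare(df=1, size=200_000)
lam_mean, lam_median = genomic_inflation(null_stats)

# Statistics inflated by a constant factor are brought back towards the null.
inflated = 1.2 * null_stats
adj_mean, adj_median = genomic_inflation(
    genomic_control(inflated, genomic_inflation(inflated)[1])
)
```

Dividing by a single factor treats every SNP as equally inflated, which is exactly the assumption the abstract questions when inflation arises from polygenic signal rather than population structure.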

Figures

**Figure 1**
Genomic inflation factor observed in simulation *versus* that predicted by theory. Data are simulated based on real genotypes of 3925 individuals and 294 831 SNPs with different numbers of causal variants (m=1, 10, 50, 100, 500 and 1000) and heritabilities (h²=0.2, 0.4 and 0.8). Each column represents the average of λ_mean (a and c) or λ_median (b and d) observed from 100 simulations. Error bars are SD. Each marked line represents the predicted λ_mean or λ_median averaged over 100 prediction replicates given m and h². For case–control studies (c and d), h² refers to heritability of liability on the underlying scale.
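A toy version of this kind of simulation can be sketched as follows. It departs from the design in the caption in important ways, all of which are assumptions made for brevity: genotypes are simulated independently (so there is no linkage disequilibrium) rather than taken from real data, the sample and SNP counts are small, causal effects are drawn from a normal distribution, and the squared t-statistic from simple regression is treated as a 1-df χ² statistic. The function name and default parameters are hypothetical.

```python
import numpy as np


def simulate_lambda(n=1000, n_snps=2000, n_causal=200, h2=0.5, seed=1):
    """Simulate a polygenic quantitative trait and return (lambda_mean, lambda_median)."""
    rng = np.random.default_rng(seed)

    # Independent genotypes coded 0/1/2, allele frequencies between 0.1 and 0.9.
    freqs = rng.uniform(0.1, 0.9, size=n_snps)
    geno = rng.binomial(2, freqs, size=(n, n_snps)).astype(float)
    geno = (geno - geno.mean(axis=0)) / geno.std(axis=0)  # standardise each SNP

    # Genetic value from randomly chosen causal SNPs, scaled to variance h2.
    causal = rng.choice(n_snps, size=n_causal, replace=False)
    g = geno[:, causal] @ rng.normal(size=n_causal)
    g = (g - g.mean()) / g.std() * np.sqrt(h2)

    # Phenotype = genetic value + residual with variance 1 - h2.
    y = g + rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    y = (y - y.mean()) / y.std()

    # Single-SNP association: correlation r, then the regression t^2 statistic.
    r = geno.T @ y / n
    chi2 = (n - 2) * r**2 / (1.0 - r**2)
    return chi2.mean(), np.median(chi2) / 0.4549364


lam_mean, lam_median = simulate_lambda()           # polygenic trait
null_mean, null_median = simulate_lambda(h2=0.0)   # no genetic signal
```

Even without any population structure, the mean statistic for the polygenic trait exceeds that of the null trait, because a share of the tested SNPs are themselves causal.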

**Figure 2**
Genomic inflation factor for pruned (or selected) SNPs in the simulation study. A GWAS for a quantitative trait is simulated based on real genotypes of 3925 individuals and 294 831 SNPs, with a heritability of 0.8 and different numbers of causal variants (10, 50, 100, 500 and 1000). Each column represents the average of λ_mean (b, d and f) or λ_median (a, c and e) observed from 100 simulations. Error bars are SD. In (a and b), SNPs are pruned for LD using PLINK²² with r² thresholds of 0.1, 0.3, 0.5 and 0.7. In (c and d), SNPs are pruned based on physical distance so that any pair of SNPs is at least 1 Mb apart. In (e and f), 10, 30, 50 and 70% of the SNPs are randomly sampled from all of the SNPs.
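The distance-based pruning in panels (c and d) could be done in several ways; the caption does not say which. One simple possibility, shown here purely as an assumed greedy approach with a hypothetical function name and made-up SNP identifiers, walks along each chromosome in position order and keeps a SNP only if it lies at least the minimum distance from the previously kept SNP.

```python
def prune_by_distance(snps, min_dist_bp=1_000_000):
    """Greedy pruning so that kept SNPs on a chromosome are >= min_dist_bp apart.

    snps: iterable of (snp_id, chromosome, position_bp) tuples.
    Chromosome labels must be mutually sortable (all ints or all strings).
    """
    kept = []
    last_pos = {}  # chromosome -> position of the most recently kept SNP
    for snp_id, chrom, pos in sorted(snps, key=lambda s: (s[1], s[2])):
        if chrom not in last_pos or pos - last_pos[chrom] >= min_dist_bp:
            kept.append(snp_id)
            last_pos[chrom] = pos
    return kept


# Hypothetical SNPs: (id, chromosome, base-pair position).
example = [
    ("rs_a", 1, 100_000),
    ("rs_b", 1, 600_000),
    ("rs_c", 1, 1_100_000),
    ("rs_d", 2, 50_000),
    ("rs_e", 1, 2_500_000),
]
kept = prune_by_distance(example)
```

Because SNPs are visited in position order, spacing consecutive kept SNPs by at least the threshold guarantees that every pair of kept SNPs on the same chromosome is at least that far apart.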

**Figure 3**
Quantile–quantile plot of height association result for QIMR data set (3925 unrelated individuals and 294 831 SNPs). All the SNPs passed stringent quality control and all the individuals are of European ancestry as verified by SNP data. The mean and median of χ²-statistics are 1.035 and 1.029, respectively.

**Figure 4**
Histograms of (a) number of SNPs in significant LD with a 'causal variant' and (b) average r² between these SNPs and the 'causal variant'. The 'causal variants' are mimicked by randomly sampling (without replacement) 100 000 out of 294 831 SNPs across the genome. Simple regression is used to test for SNPs in LD with each 'causal variant' within 5-Mb distance in either direction.

**Figure 5**
Predicted median of χ²-statistics (λ_median) of height association study in (a) the QIMR data and (b) the GIANT meta-analysis. Each column is mean±2SD of 25 prediction replicates. The straight lines are the observed λ_median in real data analyses.

**Figure 6**
Predicted genomic inflation factor for quantitative trait (a and b) and case–control (c and d) association studies. Prediction is based on 294 831 SNPs with different numbers of causal variants and heritabilities (h²), sample size (N) and disease prevalences (K, for case–control study). Each value is an average over 100 prediction replicates. For the case–control study, the number of cases and controls is equal.

**Figure 7**
Genomic inflation factor for ∼2.2-M SNPs (with exclusion of ∼636K with effective sample sizes <126 000 from the total ∼2.8 M SNPs) in GIANT meta-analysis for height with ∼133 000 samples. A total of 318 top hits were identified by GIANT meta-analysis (genome-wide false discovery rate of 0.05). Any SNP within d Mb distance (d=0.5, 1, …, or 5, x-axis) of the top hits is removed and genomic inflation factor is calculated using all of the remaining SNPs.
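The exclusion step described in this caption can be sketched as below: drop every SNP within d Mb of any top hit on the same chromosome, then recompute the inflation factors from what remains. This is a minimal sketch under stated assumptions, not the GIANT analysis itself; the function name, argument layout and the tiny example values are hypothetical, and statistics are assumed to be 1-df χ² values.

```python
import numpy as np


def lambda_excluding_hits(chrom, pos_bp, chi2, hit_chrom, hit_pos_bp, d_mb):
    """Return (lambda_mean, lambda_median) after removing SNPs near top hits."""
    chrom = np.asarray(chrom)
    pos_bp = np.asarray(pos_bp)
    chi2 = np.asarray(chi2, dtype=float)

    window_bp = d_mb * 1_000_000
    keep = np.ones(chi2.shape[0], dtype=bool)
    for hc, hp in zip(hit_chrom, hit_pos_bp):
        # Mark SNPs on the hit's chromosome within the window (hit included).
        near = (chrom == hc) & (np.abs(pos_bp - hp) <= window_bp)
        keep &= ~near

    rest = chi2[keep]
    return rest.mean(), np.median(rest) / 0.4549364


# Made-up example: one top hit on chromosome 1 at 1 Mb, window d = 0.5 Mb.
lam_mean, lam_median = lambda_excluding_hits(
    chrom=[1, 1, 1, 2],
    pos_bp=[1_000_000, 1_400_000, 5_000_000, 1_000_000],
    chi2=[30.0, 10.0, 1.0, 2.0],
    hit_chrom=[1],
    hit_pos_bp=[1_000_000],
    d_mb=0.5,
)
```

Repeating the call for d = 0.5, 1, …, 5 gives one inflation factor per window size, which is the quantity plotted against d in the figure.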
