PLoS Comput Biol. 2012;8(7):e1002600.
doi: 10.1371/journal.pcbi.1002600. Epub 2012 Jul 5.

Predicting signatures of "synthetic associations" and "natural associations" from empirical patterns of human genetic variation

Diana Chang et al. PLoS Comput Biol. 2012.

Abstract

In recent years, genome-wide association studies (GWAS) have discovered thousands of associated markers for hundreds of phenotypes. However, associated loci often explain only a relatively small fraction of heritability, and the link between association and causality has yet to be uncovered for most loci. Rare causal variants have been suggested as one scenario that may partially explain these shortcomings. Specifically, Dickson et al. recently reported simulations of rare causal variants that lead to association signals at common tag single nucleotide polymorphisms, dubbed "synthetic associations". However, an open question is what practical implications synthetic associations have for GWAS. Here, we explore the signatures exhibited by such "synthetic associations" and their implications based on patterns of genetic variation observed in human populations, thus accounting for human evolutionary history, a force disregarded in previous simulation studies. This is made possible by human population genetic data from HapMap 3, consisting of both resequencing and array-based genotyping data for the same set of individuals from multiple populations. We report that synthetic associations tend to be farther from the underlying risk alleles than "natural associations" (i.e. associations due to underlying common causal variants), but to a much lesser extent than previously predicted, with both the age and the effect size of the risk allele playing a part in this phenomenon. We find that while a synthetic association has a lower probability of capturing causal variants within its linkage disequilibrium block, sequencing around the associated variant need not extend substantially to have a high probability of capturing at least one causal variant. We also show that the minor allele frequency of synthetic associations is lower than that of natural associations for most, but not all, loci that we explored. Finally, we find the variance in associated allele frequency to be a potential indicator of synthetic associations.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Distance of synthetic and natural associations from the causal variant with which they are in greatest LD.
Box plot of the distance between each associated SNP and the causal variant with which it is in highest LD (LD measured by r²), for (a) YRI and (b) CEU in four scenarios: 2 common causal variants with a GRR of 1.5 (dark blue), 2 common causal variants with an unrealistic GRR of 3 (light blue), and 5 or 9 rare causal variants with a GRR of 3 (red and gold, respectively). Distances vary greatly among the disease loci (x-axis) and between populations, but in all regions the median (line within each box) is larger for rare causal variants than for common causal variants of lower effect size. Increasing the effect size can increase the association distance, most notably in region #5.
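As an illustration of the pairing step described in the Figure 1 caption, the following is a minimal sketch of how r² between an associated SNP and each candidate causal variant could be computed, and the causal variant in highest LD selected. It assumes phased haplotypes coded as 0/1 lists; the function names `r_squared` and `best_tagged_causal` are hypothetical and are not taken from the paper.

```python
def r_squared(hap_a, hap_b):
    """Return LD r^2 between two biallelic sites given phased 0/1 haplotypes."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")
    p_a = sum(hap_a) / n                      # frequency of allele 1 at site A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at site B
    # frequency of haplotypes carrying allele 1 at both sites
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                            # a monomorphic site carries no LD
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom


def best_tagged_causal(assoc_hap, causal_haps):
    """Return (index, r2) of the causal variant in highest LD with the associated SNP."""
    if not causal_haps:
        raise ValueError("at least one causal variant is required")
    scores = [r_squared(assoc_hap, h) for h in causal_haps]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

The distance plotted in the figure would then be the map distance between the associated SNP and the variant at the returned index; that lookup depends on position data not shown here.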
Figure 2. Distance of causal variant from “synthetic associations” partitioned by the age of the mutation.
Box plot similar to Figure 1, with rare variants in CEU and YRI separated into a more recent and an older class (Materials and Methods). Variants due to more recent mutations result in a much greater distance between the associated SNP and the causal variant in highest LD with it in 3 regions in YRI and 2 regions in CEU. Results are presented for only 4 of the disease loci because relevant data are lacking for locus #1. Note that the risk allele frequency range for rare variants is narrower than in Figure 1 (Materials and Methods) and that the y-axis scale differs between the two populations.
Figure 3. Resequence window size necessary to capture at least one causal variant.
For a given window size, the figure presents the fraction of tests, combined over all regions with significant associations, in which at least one association is within the given distance of the causal variant with which it is in highest LD. Colors correspond to the same scenarios as in Figure 1. Resequencing need not extend much further than in the common causal variant case: a window size of 0.1 cM has at least one association tagging a rare causal variant in >90% of the tests across both populations and all regions.
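The quantity described in the Figure 3 caption can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes each test is represented as a list of distances (in cM) between every significant association and the causal variant it best tags, treats the window size as a maximum distance, and skips tests with no significant association. The name `fraction_captured` is hypothetical.

```python
def fraction_captured(tests, window_cm):
    """Fraction of tests with >= 1 association within window_cm of its causal variant.

    tests: list of lists; each inner list holds the distances (cM) between the
    significant associations of one test and their best-tagged causal variants.
    Tests with no significant association (empty lists) are excluded.
    """
    with_signal = [dists for dists in tests if dists]
    if not with_signal:
        raise ValueError("no test has a significant association")
    # a test counts as captured if its closest association lies inside the window
    captured = sum(1 for dists in with_signal if min(dists) <= window_cm)
    return captured / len(with_signal)
```

Evaluating this function over a grid of window sizes would give one curve per scenario, which is the form of plot the caption describes.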
Figure 4. Minor allele frequency (MAF) of associated variants.
Box plot of the minor allele frequency for all associated variants in the different scenarios. Although synthetic associations have a lower median MAF than natural associations, the range of MAF for synthetic associations varies across loci and populations. The median MAF is similar between natural and synthetic associations for a few loci (disease locus #2 in CEU and #1 in YRI).

Cited by

  • High burden of private mutations due to explosive human population growth and purifying selection.
    Gao F, Keinan A. BMC Genomics. 2014;15 Suppl 4(Suppl 4):S3. doi: 10.1186/1471-2164-15-S4-S3. Epub 2014 May 20. PMID: 25056720 Free PMC article.
  • Fine-mapping the HOXB region detects common variants tagging a rare coding allele: evidence for synthetic association in prostate cancer.
    Saunders EJ, Dadaev T, Leongamornlert DA, Jugurnauth-Little S, Tymrakiewicz M, Wiklund F, Al Olama AA, Benlloch S, Neal DE, Hamdy FC, Donovan JL, Giles GG, Severi G, Gronberg H, Aly M, Haiman CA, Schumacher F, Henderson BE, Lindstrom S, Kraft P, Hunter DJ, Gapstur S, Chanock S, Berndt SI, Albanes D, Andriole G, Schleutker J, Weischer M, Nordestgaard BG, Canzian F, Campa D, Riboli E, Key TJ, Travis RC, Ingles SA, John EM, Hayes RB, Pharoah P, Khaw KT, Stanford JL, Ostrander EA, Signorello LB, Thibodeau SN, Schaid D, Maier C, Kibel AS, Cybulski C, Cannon-Albright L, Brenner H, Park JY, Kaneva R, Batra J, Clements JA, Teixeira MR, Xu J, Mikropoulos C, Goh C, Govindasami K, Guy M, Wilkinson RA, Sawyer EJ, Morgan A; COGS-CRUK GWAS-ELLIPSE (Part of GAME-ON) Initiative; UK Genetic Prostate Cancer Study Collaborators; UK ProtecT Study Collaborators; PRACTICAL Consortium; Easton DF, Muir K, Eeles RA, Kote-Jarai Z. PLoS Genet. 2014 Feb 13;10(2):e1004129. doi: 10.1371/journal.pgen.1004129. eCollection 2014 Feb. PMID: 24550738 Free PMC article.
  • Network analysis identifies protein clusters of functional importance in juvenile idiopathic arthritis.
    Stevens A, Meyer S, Hanson D, Clayton P, Donn RP. Arthritis Res Ther. 2014 May 8;16(3):R109. doi: 10.1186/ar4559. PMID: 24886659 Free PMC article.
  • Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes.
    Lohmueller KE, Sparsø T, Li Q, Andersson E, Korneliussen T, Albrechtsen A, Banasik K, Grarup N, Hallgrimsdottir I, Kiil K, Kilpeläinen TO, Krarup NT, Pers TH, Sanchez G, Hu Y, Degiorgio M, Jørgensen T, Sandbæk A, Lauritzen T, Brunak S, Kristiansen K, Li Y, Hansen T, Wang J, Nielsen R, Pedersen O. Am J Hum Genet. 2013 Dec 5;93(6):1072-86. doi: 10.1016/j.ajhg.2013.11.005. Epub 2013 Nov 27. PMID: 24290377 Free PMC article.
  • Deciphering the genetic architecture of low-penetrance susceptibility to colorectal cancer.
    Whiffin N, Dobbins SE, Hosking FJ, Palles C, Tenesa A, Wang Y, Farrington SM, Jones AM, Broderick P, Campbell H, Newcomb PA, Casey G, Conti DV, Schumacher F, Gallinger S, Lindor NM, Hopper J, Jenkins M, Dunlop MG, Tomlinson IP, Houlston RS. Hum Mol Genet. 2013 Dec 15;22(24):5075-82. doi: 10.1093/hmg/ddt357. Epub 2013 Jul 30. PMID: 23904454 Free PMC article.
