G3 (Bethesda). 2018 Jul 31;8(8):2817-2824.
doi: 10.1534/g3.118.200513.

On the Relationship Between High-Order Linkage Disequilibrium and Epistasis

Yanjun Zan et al. G3 (Bethesda).

Abstract

A plausible explanation for statistical epistasis revealed in genome wide association analyses is the presence of high order linkage disequilibrium (LD) between the genotyped markers tested for interactions and unobserved functional polymorphisms. Based on findings in experimental data, it has been suggested that high order LD might be a common explanation for statistical epistasis inferred between local polymorphisms in the same genomic region. Here, we empirically evaluate how prevalent high order LD is between local, as well as distal, polymorphisms in the genome. This could indicate whether high order LD should be accounted for when interpreting results from genome wide scans for statistical epistasis. Extensive and strong genome wide high order LD was revealed between pairs of markers on the high density 250k SNP-chip and individual markers revealed by whole genome sequencing in the Arabidopsis thaliana 1001-genomes collection. High order LD was more prevalent in smaller populations, but was also present in samples of several hundred individuals. An empirical example illustrates that high order LD might be an even greater challenge when the genetic architecture is more complex than the common assumption of bi-allelic loci. The example shows how significant statistical epistasis is detected for a pair of markers in high order LD with a complex multi allelic locus. Overall, our study illustrates the importance of considering explanations other than functional genetic interactions when genome wide statistical epistasis is detected, in particular when the results are obtained in small populations of inbred individuals.

Keywords: Arabidopsis thaliana; epistasis; high order linkage disequilibrium; leaf; molybdenum.

Figures

Figure 1
Illustration of how the pseudomarkers (P1, P2, P3, P4) are created. They are used to estimate the second order linkage disequilibrium between a pair of linked or unlinked markers (predictors; M1 and M2) and a third linked or unlinked functional polymorphism (target; Q). The pseudomarkers together represent the possible bi-allelic formulations of the two-locus M1M2 genotypes. The maximum pairwise LD-r2 between the target and the four pseudomarkers (P4) defines the second order LD between the predictors (M1, M2) and the target (Q). The general two-locus epistatic model (Model 1), when fitted to the genotypes of the predictors (M1, M2), will capture the variance of the target (Q).
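The construction described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes inbred lines with bi-allelic markers coded 0/1, takes LD-r2 to be the squared Pearson correlation between two such vectors, and assumes that each of the four pseudomarkers is an indicator of one two-locus M1M2 genotype class (the caption does not spell out the coding). The function names `ld_r2`, `pseudomarkers`, and `second_order_r2` are hypothetical.

```python
import numpy as np


def ld_r2(a, b):
    """Squared Pearson correlation between two 0/1 genotype vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.std() == 0 or b.std() == 0:
        return 0.0  # monomorphic vector: r2 is undefined, treated here as 0
    return float(np.corrcoef(a, b)[0, 1] ** 2)


def pseudomarkers(m1, m2):
    """Four indicator vectors, one per two-locus genotype class.

    ASSUMPTION: one-class-versus-rest coding (00, 01, 10, 11).
    """
    m1 = np.asarray(m1)
    m2 = np.asarray(m2)
    return [
        ((m1 == g1) & (m2 == g2)).astype(int)
        for g1 in (0, 1)
        for g2 in (0, 1)
    ]


def second_order_r2(m1, m2, q):
    """Maximum LD-r2 between the target q and the four pseudomarkers."""
    return max(ld_r2(p, q) for p in pseudomarkers(m1, m2))


# Toy data (made up for illustration): 20 lines, four balanced genotype classes.
m1 = np.array([0, 0, 1, 1] * 5)
m2 = np.array([0, 1, 0, 1] * 5)
q = m1 & m2  # target allele present only in the M1=1, M2=1 class

print("M1 to Q r2:       ", round(ld_r2(m1, q), 3))
print("M2 to Q r2:       ", round(ld_r2(m2, q), 3))
print("second order r2:  ", round(second_order_r2(m1, m2, q), 3))
```

In this toy case the target is tagged perfectly by one two-locus genotype class, so the second order LD-r2 is 1 while each individual predictor reaches an LD-r2 of only 1/3 with the target.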
Figure 2
Illustration of how the prevalence of high order LD-r2 to the targets in a 6 Mb window on A. thaliana chromosome 2 (8–14 Mb) depends on the distance of the predictors from the target. The color gradient illustrates the proportion of predictor pairs that reach a particular LD-r2 (x-axis) depending on the distance between the nearest predictor and the target (y-axis). Results are presented for populations with n = 100 (A) and n = 728 (B) individuals.
Figure 3
Strong second order LD-r2 also exists when the individual predictor to target LD-r2 is weak. The intensity of each dot illustrates the number of cases with a particular high order LD-r2 / maximum individual predictor to target LD-r2 combination. Dots below the line are cases where the high order LD-r2 is stronger than any individual predictor to target LD-r2 (n = 728).
Figure 4
Number of predictor pairs of different classes in strong high order LD-r2 (>0.6) to targets detected in the evaluated windows and estimated genome wide. The distribution of LD-r2 values > 0.6 for the cis-cis, cis-trans, and trans-trans predictor pairs is shown for n = 100 (A) and n = 728 (B). The total number of predictor pairs with high order LD-r2 above 0.6 in the three classes is summarized in (C) and used to estimate the total expected number of predictor pairs in the entire genome (D; error bars show the estimation error estimated from the results obtained for the three windows; see Materials and Methods).
Figure 5
An illustration of how the high order LD between four polymorphisms affecting the level of molybdenum in the A. thaliana leaf (Forsberg et al. 2015) likely explains the significant statistical epistasis detected for a cis-trans predictor pair. (A) Boxplots illustrating the phenotypic distribution in the four genotype classes defined by the cis-trans predictor pair with the strongest significant epistatic interaction to the level of molybdenum in the A. thaliana leaf. (B) Illustration of the connection between the two-locus genotypes of the predictor pair and the minor alleles at the four linked loci associated with this trait on chromosome 2 (Forsberg et al. 2015). The top box in (B) illustrates the two-locus genotype for the predictor pair, with the width of each sub-box indicating the number of individuals in each genotype class in the population. In the bottom box in (B), each individual is represented as a column, where green (molybdenum decreasing) and orange (molybdenum increasing) colors indicate that the individual carries the minor alleles at the four loci identified in Forsberg et al. (2015). mGWA1 and mGWA2 are SNP markers associated with the trait, and 53del and 326 are structural polymorphisms (Forsberg et al. 2015).
