PLoS Genet. 2015 Jan 8;11(1):e1004857.
doi: 10.1371/journal.pgen.1004857. eCollection 2015 Jan.

The genetic and mechanistic basis for variation in gene regulation

Athma A Pai et al. PLoS Genet.

Abstract

It is now well established that noncoding regulatory variants play a central role in the genetics of common diseases and in evolution. However, until recently, we have known little about the mechanisms by which most regulatory variants act. For instance, what types of functional elements in DNA, RNA, or proteins are most often affected by regulatory variants? Which stages of gene regulation are typically altered? How can we predict which variants are most likely to impact regulation in a given cell type? Recent studies, in many cases using quantitative trait loci (QTL)-mapping approaches in cell lines or tissue samples, have provided us with considerable insight into the properties of genetic loci that have regulatory roles. Such studies have uncovered novel biochemical regulatory interactions and led to the identification of previously unrecognized regulatory mechanisms. We have learned that genetic variation is often directly associated with variation in regulatory activities (namely, we can map regulatory QTLs, not just expression QTLs [eQTLs]), and we have taken the first steps towards understanding the causal order of regulatory events (for example, the role of pioneer transcription factors). Yet, in most cases, we still do not know how to interpret overlapping combinations of regulatory interactions, and we are still far from being able to predict how variation in regulatory mechanisms is propagated through a chain of interactions to eventually result in changes in gene expression profiles.

Conflict of interest statement

I have read the journal's policy and have the following conflicts: JKP is on the scientific advisory boards for 23andMe and DNANexus with stock options.

Figures

Figure 1. A cascade of regulatory mechanisms by which an eQTL SNP can affect gene expression.
Studies mapping regulatory QTLs have identified a variety of mechanisms, many of which are coordinated, by which eQTLs might act to affect variation in mature mRNA levels. First, eQTL SNPs can impact epigenetic modifications and transcription initiation. These include regulatory processes such as transcription factor binding, histone modifications, enhancer activity (perhaps mediated by chromatin architecture and conformation), and DNA methylation. Transcriptional mechanisms, and specifically transcription factor binding, are likely the strongest contributors to variation in steady-state mRNA levels. Second, recent work has increased appreciation for transcriptional and cotranscriptional processes as major contributors to variation in gene expression levels and mRNA isoform diversity. These include mechanisms such as transcriptional elongation (by PolII traveling rates), cotranscriptional splicing, and mRNA processing and modification. Third, eQTL SNPs both within and outside the transcript have been shown to influence posttranscriptional mRNA processing, which includes mechanisms such as general mRNA degradation, defects in polyadenylation, and targeting by miRNAs. Finally, preliminary studies have shown that we do not yet fully appreciate the extent to which variation in mRNA expression might impact or even correlate with variation in downstream protein products, whose synthesis is additionally regulated by a set of posttranscriptional and translational mechanisms.
Figure 2. An approach for joint quantitative analysis of gene expression and regulatory QTLs.
A goal of interindividual studies of regulatory mechanisms is to understand the extent to which variation at regulatory loci underlies gene expression levels across individuals. (A) This example, using hypothetical data, shows a QTL that is associated with levels of both DNA methylation in an upstream CpG island (left) and gene expression (right). Though the example QTL shown here indicates higher DNA methylation due to a G allele (potentially in a CpG pair), SNPs associated with methylation do not necessarily fall in CpG dinucleotides. (B) The observed correlation between DNA methylation and gene expression levels could be due to a few different underlying relationships, two of which we have highlighted here. The extent to which gene expression and regulatory differences are correlated through an intermediate variable is often tested using an approach called partial correlation analysis. This involves regressing out the effects of an intermediate variable (genotype in this example) from both DNA methylation and gene expression levels and then evaluating the residual correlation between the two variables (left). One possibility is that the QTL directly affects differences in DNA methylation, which then determine (cause) the gene expression level. Thus, gene expression is regulated by the genotype through the DNA methylation effects (middle), and the residual variance in gene expression levels will still be correlated with residual DNA methylation levels. Alternatively, genotype is independently associated with both DNA methylation and gene expression levels, for instance by directly influencing changes in an upstream mechanism (such as transcription factor binding) that affects DNA methylation and gene expression levels. This would make DNA methylation and gene expression appear to be correlated, but not causally related (right), and the residual values no longer show any significant correlation.
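The partial correlation test described in this caption can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the effect sizes, the sample size, the additive 0/1/2 genotype coding, and the helper names `residualize` and `partial_correlation` are hypothetical and are not taken from the paper.

```python
import numpy as np


def residualize(y, covariate):
    """Residuals of y after least-squares regression on covariate plus an intercept."""
    X = np.column_stack([np.ones_like(covariate, dtype=float), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def partial_correlation(x, y, covariate):
    """Pearson correlation between x and y after regressing covariate out of both."""
    return np.corrcoef(residualize(x, covariate), residualize(y, covariate))[0, 1]


rng = np.random.default_rng(0)
n = 2000  # hypothetical number of individuals
genotype = rng.integers(0, 3, size=n).astype(float)  # 0, 1, or 2 copies of the G allele

# Scenario 1 (assumed effect sizes): genotype -> methylation -> expression.
meth_causal = 0.5 * genotype + rng.normal(0, 1, n)
expr_causal = -0.8 * meth_causal + rng.normal(0, 1, n)

# Scenario 2 (assumed effect sizes): genotype affects both traits independently.
meth_indep = 0.5 * genotype + rng.normal(0, 1, n)
expr_indep = -0.4 * genotype + rng.normal(0, 1, n)

r_causal = partial_correlation(meth_causal, expr_causal, genotype)
r_indep = partial_correlation(meth_indep, expr_indep, genotype)

print(f"residual correlation, methylation drives expression: {r_causal:.2f}")
print(f"residual correlation, independent genotype effects:  {r_indep:.2f}")
```

In the first scenario the residual correlation stays clearly nonzero, because methylation variation not explained by genotype still propagates to expression. In the second it falls close to zero, matching the two outcomes the caption describes.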
Figure 3. A representative example of a QTL in a TF binding site correlated with changes across multiple regulatory mechanisms.
Many concerted changes in regulatory mechanisms across genotypes can be linked to a sequence change in transcription factor binding sites, which might causally influence downstream changes. (A) For TFs that regulate concerted changes in transcriptional marks, SNPs that cause a large change in binding affinity (as measured by a position weight matrix score, x-axes) might also show a large skew in the ratio of transcription mark reads from each allele (measured as fraction of reads from the reference allele, y-axes). Evidence for this correlation across all SNPs in binding sites (top panel, red points) implies a relationship between TF binding and the transcriptional mark. Significant correlations can then be assessed for a given TF across multiple transcriptional marks (bottom panel, where each line represents a correlation using allelic biases measured from different histone modifications, PolII localization, etc.) to understand which mechanisms might be influenced by changes in binding of the given TF. (B) Overall, allelic biases in transcriptional marks at SNPs that can affect TF binding affinity show a pattern whereby increased TF binding promotes open chromatin (measured by DNaseI sensitivity), nucleosome positioning, and enrichment of activating histone modifications relative to sites with weaker TF binding. Importantly, since SNPs in these binding sites usually only have moderate-to-weak effects on binding affinity, QTL SNPs most likely serve to shift the equilibrium frequencies between these two configurations within populations of cells. (C) This example of a TF binding QTL shows a SNP, rs2886870, that falls within a binding site for the NF-kB transcription factor. NF-kB ChIP-seq data show that LCLs with at least one T allele (TG genotype; purple) matching the consensus motif sequence have higher NF-kB binding than LCLs with no T alleles (GG genotype; orange). The top panel shows the distribution of ChIP-seq reads in a 500-bp window around rs2886870, with the grey line representing the mean across four individuals and the colored outlines representing the 95% confidence intervals. The rs2886870 SNP also acts as a QTL for several downstream regulatory patterns, with the T allele promoting significantly increased DNaseI hypersensitivity, PolII localization, and H3K27ac marks at the transcription factor binding site and increased PolII localization, H3K4me3 marks, and H3K27ac marks at the promoter of the downstream C3orf59 gene, whose expression is also significantly associated with this QTL (panel reproduced from McVicker et al. 2013 [25]).
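The comparison in panel (A), change in position weight matrix (PWM) score against the fraction of reads from the reference allele, can be sketched as follows. The 4-position PWM, the SNPs, and the read counts are all invented for illustration. The scoring convention (a sum of log2 odds against a uniform background) is one common choice and is assumed here, not confirmed as the method used in the underlying study.

```python
import numpy as np

BASES = "ACGT"


def pwm_log_odds(pwm, seq, background=0.25):
    """Sum of per-position log2 odds of seq under pwm (rows: positions, columns: A, C, G, T)."""
    idx = [BASES.index(b) for b in seq]
    probs = pwm[np.arange(len(seq)), idx]
    return float(np.sum(np.log2(probs / background)))


def pwm_score_change(pwm, ref_seq, snp_pos, alt_base):
    """PWM score of the reference-allele site minus that of the alternate-allele site."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    return pwm_log_odds(pwm, ref_seq) - pwm_log_odds(pwm, alt_seq)


# Hypothetical motif: strong preference for A, G, T at positions 0-2, none at position 3.
pwm = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.10, 0.10, 0.10, 0.70],
    [0.25, 0.25, 0.25, 0.25],
])

# Hypothetical SNPs: (reference site, SNP offset, alternate base, ref reads, alt reads).
snps = [
    ("AGTA", 0, "C", 80, 20),
    ("AGTC", 1, "A", 75, 25),
    ("CGTA", 0, "A", 30, 70),
    ("AGTA", 3, "C", 52, 48),
    ("AGAG", 2, "T", 28, 72),
]

delta_score = np.array([pwm_score_change(pwm, s, p, a) for s, p, a, _, _ in snps])
ref_fraction = np.array([ref_n / (ref_n + alt_n) for _, _, _, ref_n, alt_n in snps])

# Pearson correlation between predicted affinity change and observed allelic bias.
corr = np.corrcoef(delta_score, ref_fraction)[0, 1]
print(f"correlation between PWM score change and reference-allele fraction: {corr:.2f}")
```

A positive correlation in such a plot means that the allele predicted to bind the TF more strongly also carries more of the reads for the mark, which is the pattern the caption interprets as a relationship between TF binding and the transcriptional mark.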

References

    1. Wray GA (2007) The evolutionary significance of cis-regulatory mutations. Nat Rev Genet 8: 206–216. Available: http://bejerano.stanford.edu/readings/public/
    2. Carroll SB (2008) Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134: 25–36. doi: 10.1016/j.cell.2008.06.030
    3. Cookson W, Liang L, Abecasis G, Moffatt M, Lathrop M (2009) Mapping complex disease traits with global gene expression. Nat Rev Genet 10: 184–194. doi: 10.1038/nrg2537
    4. Montgomery SB, Dermitzakis ET (2011) From expression QTLs to personalized transcriptomics. Nat Rev Genet 12: 277–282. doi: 10.1038/nrg2969
    5. Pickrell JK (2014) Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am J Hum Genet 94: 559–573. doi: 10.1016/j.ajhg.2014.03.004
