PLoS Comput Biol. 2007 May;3(5):e99.
doi: 10.1371/journal.pcbi.0030099. Epub 2007 Apr 19.

Frequent gain and loss of functional transcription factor binding sites


Scott W Doniger et al. PLoS Comput Biol. 2007 May.

Erratum in

Abstract

Cis-regulatory sequences are not always conserved across species. Divergence within cis-regulatory sequences may result from the evolution of species-specific patterns of gene expression or the flexible nature of the cis-regulatory code. The identification of functional divergence in cis-regulatory sequences is therefore important for both understanding the role of gene regulation in evolution and annotating regulatory elements. We have developed an evolutionary model to detect the loss of constraint on individual transcription factor binding sites (TFBSs). We find that a significant fraction of functionally constrained binding sites have been lost in a lineage-specific manner among three closely related yeast species. Binding site loss has previously been explained by turnover, where the concurrent gain and loss of a binding site maintains gene regulation. We estimate that nearly half of all loss events cannot be explained by binding site turnover. Recreating the mutations that led to binding site loss confirms that these sequence changes affect gene expression in some cases. We also estimate that there is a high rate of binding site gain, as more than half of experimentally identified S. cerevisiae binding sites are not conserved across species. The frequent gain and loss of TFBSs implies that cis-regulatory sequences are labile and, in the absence of turnover, may contribute to species-specific patterns of gene expression.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Evolutionary Models for Transcription Factor Binding Sites
Three evolutionary models are considered in this study: a neutral model of evolution, which assumes no functional constraint (A); a conserved TFBS model, which uses site-specific substitution matrices representing the varying constraints on each nucleotide position of a binding site (B); and a semiconserved model, which combines the neutral and TFBS models to identify sequences showing loss of constraint, indicated by the asterisk (C). We also considered the case of loss in combination with gain, i.e., turnover (D), where the loss of an ancestral binding site (black oval) is accompanied by the gain of a compensatory binding site (red oval).
Figure 2. Identifying Conserved and Semiconserved Binding Sites
(A) The distribution of posterior probabilities for 2,000 putative Rox1 binding sites present in yeast intergenic sequences. (B) The distribution of 2,000 Rox1 sites simulated under a neutral model (red) or a conserved binding site model (blue). (C) The distribution of 2,000 Rox1 sites simulated under a semiconserved model, where loss of constraint occurred at a random location on the phylogenetic tree, excluding the outgroup. The log2 posterior probability of the neutral model is plotted on the x-axis, and the posterior probability of the conserved model is plotted on the y-axis. Since the three probabilities sum to one, p(semiconserved | data) = 1 − p(neutral | data) − p(conserved | data). Conserved and semiconserved sites were classified by three cutoffs (lines), defined in the text and determined by the simulations. Sites passing cutoffs one and two are annotated as conserved. Sites passing cutoffs one and three are annotated as semiconserved. The three sites tested experimentally are shown in pink.
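As a rough illustration of why the three posterior probabilities in this figure sum to one, the sketch below applies Bayes' rule to three competing models. The log-likelihood values, the uniform priors, and the function name model_posteriors are all assumptions made for the example; the record does not give the priors or likelihoods used in the study, and the three classification cutoffs are not reproduced here.

```python
import math

def model_posteriors(log_likelihoods, priors=None):
    """Posterior probability of each model via Bayes' rule.

    log_likelihoods: dict mapping model name -> log P(data | model).
    priors: dict mapping model name -> prior probability.
            Uniform priors are an assumption, not taken from the paper.
    """
    names = list(log_likelihoods)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    # Work in log space and subtract the maximum for numerical stability.
    log_joint = {n: log_likelihoods[n] + math.log(priors[n]) for n in names}
    largest = max(log_joint.values())
    total = sum(math.exp(v - largest) for v in log_joint.values())
    return {n: math.exp(log_joint[n] - largest) / total for n in names}

# Hypothetical log-likelihoods for a single aligned site (illustrative only).
toy_posteriors = model_posteriors(
    {"neutral": -42.0, "conserved": -40.0, "semiconserved": -38.5}
)

# The semiconserved posterior can be recovered from the other two.
p_semiconserved = 1.0 - toy_posteriors["neutral"] - toy_posteriors["conserved"]
```

Because the posteriors are normalized over exactly three models, plotting two of them, as in the figure, fully determines the third.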
Figure 3. Distribution of Binding Site Scores from Neutral, Conserved, and Semiconserved Sites for 91 Binding Site Models
We use the log-odds score of a sequence given a PWM relative to the genome-wide nucleotide frequencies as a proxy for binding energy. The semiconserved category (black bars) only includes sites from species where functional constraint has been maintained. The loss category (diagonally striped bars) shows sites from species where functional constraint has been lost. The neutral category (grey) shows sites generated by neutral simulations.
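The log-odds score described in this caption can be sketched as the sum, over binding site positions, of the log2 ratio between the PWM probability of the observed base and its background frequency. The toy PWM, the uniform background, and the function name log_odds_score below are hypothetical and chosen only for illustration; they are not the matrices or genome-wide frequencies used in the study.

```python
import math

def log_odds_score(site, pwm, background):
    """Log2-odds score of a sequence under a PWM versus background frequencies.

    site: DNA string with the same length as the PWM.
    pwm: list of dicts, one per position, mapping base -> probability.
    background: dict mapping base -> background nucleotide frequency.
    """
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(
        math.log2(column[base] / background[base])
        for base, column in zip(site, pwm)
    )

# Hypothetical 3-position PWM (each column sums to 1).
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
# Uniform background, assumed for simplicity.
toy_background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

strong_site_score = log_odds_score("AGA", toy_pwm, toy_background)
weak_site_score = log_odds_score("CCA", toy_pwm, toy_background)
```

A sequence matching the preferred bases scores above zero and a poor match scores below zero, which is what lets the score stand in for binding energy when comparing site categories.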
Figure 4. Semiconserved Binding Sites That Were Tested Using Gene Expression Assays
The sequence logo representing the PWM and the alignment of each semiconserved binding site are shown for Rox1 (A), Ndt80 (B), and Msn2/4 (C). The binding site in S. cerevisiae is outlined in grey. The sequence changes shown in red were made in the S. cerevisiae promoter to test the predictions of the semiconserved binding site model.
