J Comput Biol. 2010 Dec;17(12):1591-606.
doi: 10.1089/cmb.2010.0084.

Studying the evolution of promoter sequences: a waiting time problem

Sarah Behrens et al. J Comput Biol. 2010 Dec.

Abstract

To gain a better understanding of the evolutionary dynamics of regulatory DNA sequences, we address the following questions: (1) How long does it take until a given transcription factor (TF) binding site emerges at random in a promoter sequence? and (2) How does the composition of a TF binding site affect this waiting time? Using two different probabilistic models (an i.i.d. model and a neighbor-dependent model), we compute the expected waiting time until each k-mer, with k ranging from 5 to 10, appears in a promoter of a species. Our findings indicate that new TF binding sites can be created on a short evolutionary time scale, i.e., in a time span below the speciation time of human and chimp. Furthermore, the composition of a TF binding site plays a crucial role in the waiting time until it appears, and the CpG methylation-deamination substitution process probably accelerates the creation of new TF binding sites. A screening of existing TF binding sites moreover reveals that k-mers predicted to have short waiting times occur more frequently than others. Supplementary Material is available at www.liebertonline.com/cmb.
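To make the waiting-time question concrete, the following is a minimal Monte Carlo sketch of an i.i.d.-style setting. It is not the authors' method (the abstract says they compute expected waiting times from probabilistic models), and every name and parameter here (`waiting_time`, `mean_waiting_time`, `length`, `mu`, `max_generations`) is a hypothetical choice for illustration. It assumes a uniform starting sequence, one mutation rate shared by all sites, and uniform substitution to one of the other three bases, with toy values far larger than realistic per-generation mutation rates so that it finishes quickly.

```python
import random

ALPHABET = "ACGT"


def waiting_time(target, length=200, mu=0.01, max_generations=100_000, rng=None):
    """Generations until `target` first occurs in a randomly evolving sequence.

    Returns None if the target has not appeared after `max_generations`.
    All defaults are toy values (assumptions), not parameters from the paper.
    """
    rng = rng or random.Random()
    # Start from a sequence drawn uniformly at random (assumption).
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    for generation in range(max_generations + 1):
        if target in "".join(seq):
            return generation
        # Each site mutates independently with probability mu per generation,
        # changing to one of the three other bases with equal probability.
        for i in range(length):
            if rng.random() < mu:
                seq[i] = rng.choice([b for b in ALPHABET if b != seq[i]])
    return None


def mean_waiting_time(target, runs=50, seed=0, **kwargs):
    """Average waiting time over the runs in which the target appeared."""
    rng = random.Random(seed)
    times = [waiting_time(target, rng=rng, **kwargs) for _ in range(runs)]
    observed = [t for t in times if t is not None]
    return sum(observed) / len(observed) if observed else None


if __name__ == "__main__":
    for kmer in ("ACGTA", "AAAAA"):
        print(kmer, mean_waiting_time(kmer, runs=20, length=100, mu=0.01))
```

Runs that hit `max_generations` are dropped from the average, which biases the estimate downward for rare k-mers; an analytic treatment avoids that problem. A neighbor-dependent variant would additionally need context-specific rates (for example, an elevated rate at CpG sites), which this sketch leaves out.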

Figures

FIG. 1.
Minimal, maximal, and average waiting times in model M1 (log scale). These waiting times (generations) are computed based on the results in Table 2.
FIG. 2.
Histograms of the waiting times in model M1. The expected waiting times (generations) are taken from Table 2.
FIG. 3.
Waiting times as a function of the number of promoters in model M1 (log scale). Minimal, maximal, and average waiting times (generations) for 5- and 10-mers to appear in at least one of several promoters.
FIG. 4.
Example. Assuming that the SP1 motif is the set of 10-mers (and their reverse complements) with a score of at least 95% of the maximal score, we can derive the ranks for this 10-mer set, i.e., the ranks among all 10-mers in ascending order according to their waiting time until emergence and normalize them.
FIG. 5.
Histogram of the relative ranks of k-mers contained in JASPAR PCMs. For all JASPAR matrices of length k, 5 ≤ k ≤ 10, we assigned relative ranks to the k-mers with a relative score threshold of 0.95 (according to the procedure illustrated in Fig. 4). The horizontal line represents the uniform case, i.e., the case where the relative ranks would be distributed uniformly.
FIG. 6.
Histogram of the minimal relative ranks of JASPAR TFs. After having assigned relative ranks to the k-mers contained in JASPAR matrices (see Fig. 5), we determined the smallest relative rank for every TF. Thus, this figure depicts JASPAR TFs ranked by their waiting time until appearance under our model.
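The ranking procedure described in the captions of Figs. 4 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the normalization (rank divided by the number of k-mers) and the alphabetical tie-break are guesses at details the captions do not spell out, and the waiting times in the usage example are made-up toy numbers.

```python
def reverse_complement(kmer):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return kmer.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def relative_ranks(waiting_times):
    """Map each k-mer to its normalized rank in ascending waiting time.

    `waiting_times` maps k-mer -> expected waiting time. The shortest waiting
    time gets rank 1/n and the longest gets 1.0 (assumed normalization).
    Ties are broken alphabetically (assumption).
    """
    ordered = sorted(waiting_times, key=lambda kmer: (waiting_times[kmer], kmer))
    n = len(ordered)
    return {kmer: (i + 1) / n for i, kmer in enumerate(ordered)}


def minimal_relative_rank(motif_kmers, ranks):
    """Smallest relative rank among the k-mers that make up one TF motif."""
    return min(ranks[kmer] for kmer in motif_kmers)


if __name__ == "__main__":
    # Toy waiting times (hypothetical values, not results from the paper).
    toy = {"AAAAA": 3.0, "CCCCC": 1.0, "GGGGG": 2.0, "TTTTT": 4.0}
    ranks = relative_ranks(toy)
    print(ranks["CCCCC"])  # 0.25
    # A motif is treated as a set of k-mers plus their reverse complements.
    motif = {"AAAAA", reverse_complement("AAAAA")}
    print(minimal_relative_rank(motif, ranks))  # 0.75
```

In this reading, a motif with a small minimal relative rank contains at least one k-mer that is predicted to emerge quickly.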
