Science. 2013 Dec 13;342(6164):1367-72.
doi: 10.1126/science.1243490.

Exonic transcription factor binding directs codon choice and affects protein evolution

Andrew B Stergachis et al. Science. 2013.

Abstract

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.
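As a rough illustration of the duon concept described in the abstract, the following minimal sketch counts the fraction of codons that overlap a TF footprint. The coordinates are toy values, the function names (`codon_intervals`, `overlaps`, `duon_fraction`) are hypothetical, and the overlap criterion (at least one shared base) is an assumption; the paper's actual pipeline and thresholds are not given in this record.

```python
from typing import List, Tuple

# Half-open, 0-based interval [start, end); a convention assumed for this sketch.
Interval = Tuple[int, int]


def codon_intervals(cds_start: int, cds_end: int) -> List[Interval]:
    """Split a contiguous, in-frame coding region into consecutive codons."""
    return [(s, s + 3) for s in range(cds_start, cds_end - 2, 3)]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def duon_fraction(codons: List[Interval], footprints: List[Interval]) -> float:
    """Fraction of codons overlapping any footprint (assumed duon criterion)."""
    if not codons:
        return 0.0
    n_duons = sum(1 for c in codons if any(overlaps(c, f) for f in footprints))
    return n_duons / len(codons)


# Toy data, not from the paper: a 30-bp coding region and two footprints.
toy_codons = codon_intervals(0, 30)
toy_footprints = [(4, 10), (25, 27)]
print(duon_fraction(toy_codons, toy_footprints))
```

For genome-scale data, an interval tree or a sorted sweep would replace the nested scan used here, which is quadratic in the worst case.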

Figures

Figure 1. TFs densely populate and evolutionarily constrain protein-coding exons
(A) Distribution of DNaseI footprints. (B) Per-nucleotide DNaseI cleavage and ChIP-seq signal for coding CTCF (left) and NRSF (right) binding elements. (C) Proportion of coding bases within DNaseI footprints in each of 81 cell types (left), or any cell type (right). (D) Average footprint density within first, internal, or final coding exons (mean ± SEM; p-value, paired t-test; n.s.: p-value > 0.1). (E) PhyloP conservation at fourfold degenerate bases (4FDBs) within and outside footprints. (F) Estimated mutational age at all (grey), synonymous (brown) and nonsynonymous (red) coding SNVs (European) within and outside footprints (p-values per (21)). (G) Structure of DNA-bound KLF4 vs. average per-nucleotide DNaseI cleavage and evolutionary constraint at KLF4 footprints. (H) Average per-nucleotide conservation at 4FDBs (brown) and non-degenerate bases (NDBs; red) overlapping KLF4 (left) and NFIC (right) footprints (r = Pearson correlation, conservation at promoter bases vs. 4FDBs (top) or NDBs (bottom)). (I) Evolutionary constraint imparted by 63 TFs at promoter elements, 4FDBs and NDBs (Pearson correlations).
Figure 2. Transcription factors modulate global codon biases
(A) Proportions of all codons (grey), or codons outside of (yellow), or within (purple) footprints, that encode asparagine (top) or leucine (bottom). Note that codons with bias (AAC for asparagine and CTG for leucine) preferentially localize within footprints. (B) Preferential footprinting of biased codons, calculated as in (A) (p-values, Pearson's chi-squared test). (C) Preferential footprinting of each codon trinucleotide in coding vs. non-coding regions (C = coding, NC = non-coding). (D) Difference in average evolutionary constraint at 3rd positions of biased codons outside vs. within footprints (p-values, Mann-Whitney test). (E) Proportions of amino acids encoded by CpG-containing codons among all codons (grey), codons outside footprints (yellow), or codons within footprints (purple).
Figure 3. TFs exploit and avoid specific coding features
(A) Percentage of TF motifs occupied in coding vs. non-coding regions (p-values, paired t-test). (B) Density of NFYA (left), AP2 (middle) and SP1 (right) footprints relative to translated region of first coding exons. (C) (top) Density of YY1 footprints across first coding exons. (bottom) YY1 recognition sequence and corresponding amino acid sequence within YY1 footprints overlapping start codons. (D) (top left and bottom) For NRSF as per (C). (right, arrow) Protein domain annotation of first exon third-frame NRSF footprints vs. SP1 footprints. (E) TF preference (avoidance) of stop codon trinucleotides within vs. outside footprints in non-coding regions (p-values, Pearson's chi-squared test).
Figure 4. Genetic variation in duons frequently alters TF occupancy
(A) Proportion of coding footprints overlapping a SNV in any of 81 cell types. (B) Proportion of SNVs in duons that allelically alter TF occupancy. (C) (top) Per-nucleotide DNaseI cleavage at common nonsynonymous G→A SNV (rs8110393) in G/G and A/A homozygous cells. (bottom) Allelic SP1 occupancy in heterozygous (G/A) cells. (D) Proportion of synonymous and nonsynonymous variants in duons that allelically alter TF occupancy. (E–F) Proportion of nonsynonymous variants from (D) grouped by predicted impact of coding variant on protein function using (E) SIFT or (F) Polyphen-2. Note that none of the bins are significantly different (Fisher's exact test; n.s. indicates p-value > 0.1).
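
Several captions above (Figures 2B and 3E) report Pearson's chi-squared tests for whether a codon or trinucleotide is preferentially found within footprints. The sketch below shows how such a test could be computed on a 2×2 table of counts (rows: within vs. outside footprints; columns: codon of interest vs. other codons). The counts are toy values rather than data from the paper, the function name `chi2_2x2` is hypothetical, and no continuity correction is applied; whether the authors used one is not stated here.

```python
import math
from typing import Sequence, Tuple


def chi2_2x2(table: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pearson's chi-squared statistic and p-value (1 d.f.) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy counts (hypothetical):
#                      codon of interest   other codons
# within footprints           30                10
# outside footprints          20                40
stat, p = chi2_2x2([[30, 10], [20, 40]])
print(f"chi2 = {stat:.3f}, p = {p:.2e}")
```

The closed-form p-value holds only for one degree of freedom; larger tables would need a general chi-squared survival function, such as the one provided by a statistics library.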

References

    1. Grantham R, Gautier C, Gouy M, Mercier R, Pavé A. Nucleic acids research. 1980;8:r49–r62.
    2. Ikemura T. Journal of molecular biology. 1981;151:389–409.
    3. Grantham R, Gautier C, Gouy M, Jacobzone M, Mercier R. Nucleic acids research. 1981;9:r43–74.
    4. Gouy M, Gautier C. Nucleic acids research. 1982;10:7055–74.
    5. Eyre-Walker A, Bulmer M. Nucleic acids research. 1993;21:4599–603.
