Genome Res. 2015 Oct;25(10):1546-57.
doi: 10.1101/gr.190546.115. Epub 2015 Jul 30.

The frequent evolutionary birth and death of functional promoters in mouse and human

Robert S Young et al. Genome Res. 2015 Oct.

Abstract

Promoters are central to the regulation of gene expression. Changes in gene regulation are thought to underlie much of the adaptive diversification between species and phenotypic variation within populations. In contrast to earlier work emphasizing the importance of enhancer evolution and subtle sequence changes at promoters, we show that dramatic changes such as the complete gain and loss (collectively, turnover) of functional promoters are common. Using quantitative measures of transcription initiation in both humans and mice across 52 matched tissues, we discriminate promoter sequence gains from losses and resolve the lineage of changes. We also identify expression divergence and functional turnover between orthologous promoters, finding only the latter is associated with local sequence changes. Promoter turnover has occurred at the majority (>56%) of protein-coding genes since humans and mice diverged. Tissue-restricted promoters are the most evolutionarily volatile, where retrotransposition is an important, but not the sole, source of innovation. There is considerable heterogeneity of turnover rates between promoters in different tissues, but the consistency of these in both lineages suggests that the same biological systems are similarly inclined to transcriptional rewiring. The genes affected by promoter turnover show evidence of adaptive evolution. In mice, promoters are primarily lost through deletion of the promoter-containing sequence, whereas in humans, many promoters appear to be gradually decaying with weak transcriptional output and relaxed selective constraint. Our results suggest that promoter gain and loss is an important process in the evolutionary rewiring of gene regulation and may be a significant source of phenotypic diversification.

Figures

Figure 1.
Evolutionary outcomes of human and mouse promoters. Horse is shown here as the example outgroup species, although promoters are identified as being present ancestrally if they are found in at least one, but not all, outgroup species (see Methods). (A,B) Example promoter insertions and deletions. Gene models supported by the CAGE promoters are shown in the blue boxes, where closed boxes represent coding exons and empty boxes noncoding exons. The histograms in red describe the log2-transformed expression level of the annotated promoters. Orthologous sequence identified between species is highlighted by the green boxes between these sequences. (A) Promoter insertion at the SRP19 locus in the human lineage. Promoters 1 and 2 are conserved, while promoter 3 has been inserted in the human lineage. (B) Promoter deletion at the Col9a3 locus in the mouse lineage. Promoters 1 and 3 are conserved, promoter 2 has been deleted in the mouse lineage, and promoter 4 has experienced expression turnover between human and mouse. (C) Schematic diagram showing each possible evolutionary fate of a human promoter. Promoters are denoted by the black arrows in human, where the blue triangle shows a recently inserted promoter in the human lineage and the purple triangle shows a recently deleted promoter in the mouse lineage. Aligned (black horizontal lines) promoters can show either matched (green arrow) or diminished (yellow arrow) expression in mouse. A human promoter which has completely lost its promoter ability in mouse is shown by the black cross. (D) Frequencies of inserted, deleted, aligned but no promoter activity (orange circles), or conserved (matched, divergent, and diminished) promoters in human and mouse. The lack of tissue-matched CAGE data from an outgroup species prevented us from assigning these expression changes to a specific lineage, so these events can only be classed as expression turnovers between human and mouse. The yellow segments in the conserved promoters show the proportion of promoters with diminished expression in the opposite species. (E) Maximum expression values for promoters with each evolutionary outcome as described and quantified in D in human (left panel) and mouse (right panel). (F) Proportion of promoters displaying each evolutionary outcome in human and mouse. Samples are ordered by rank of human:mouse average promoter count per sample. The white line denotes the number of promoters with that tissue bias or expression profile (right axis), and the frequencies of each evolutionary outcome for each tissue bias or expression profile are detailed in Supplemental Table 2. Tissues used in subsequent groupings (reproductive, blue; brain, orange; immunity, yellow) or mentioned directly in the text (liver) are labeled individually. This figure is reproduced as Supplemental Figure 1, where all tissues are labeled.
Figure 2.
Expression turnover at aligned promoters. (A) The percentage of human promoters of a particular class and expression profile which can be aligned to mouse but show no transcriptional activity at the aligned position. The error bars represent the 95% confidence interval from 1000 samplings of the data with replacement. (B,C) Mean GERP conservation scores in 50-bp windows around human protein-coding promoters with different evolutionary outcomes. Gray lines indicate the GERP scores for genome permuted intervals. The standard error of these mean scores is negligible and not visible on this scale. The direction of transcription is shown by the black arrows. The sample sizes of promoters contributing to each line are detailed in Supplemental Table 3.
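The confidence intervals in panel A come from 1000 samplings of the data with replacement. A minimal sketch of that kind of bootstrap is given here, assuming a percentile interval over one true/false flag per promoter; the function name `bootstrap_percentage_ci`, the fixed seed, and the toy data are hypothetical and are not taken from the paper.

```python
import random

def bootstrap_percentage_ci(flags, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the percentage of True values in `flags`.

    Assumption: a simple percentile interval; the paper does not state
    which bootstrap variant was used.
    """
    if not flags:
        raise ValueError("flags must not be empty")
    rng = random.Random(seed)
    n = len(flags)
    percentages = []
    for _ in range(n_resamples):
        # Draw n observations with replacement and record the percentage.
        sample = rng.choices(flags, k=n)
        percentages.append(100.0 * sum(sample) / n)
    percentages.sort()
    lower = percentages[int(round((alpha / 2) * n_resamples))]
    upper = percentages[int(round((1 - alpha / 2) * n_resamples)) - 1]
    return lower, upper

# Hypothetical toy data: 30 of 100 aligned promoters show no activity.
flags = [True] * 30 + [False] * 70
low, high = bootstrap_percentage_ci(flags)
print(f"30.0% (95% CI {low:.1f}% to {high:.1f}%)")
```

Fixing the seed only makes the sketch reproducible; a real analysis would report the number of resamplings, as the caption does.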
Figure 3.
Recent promoter insertions and deletions in the human and mouse lineages. (A,B) The percentage of promoters of a particular class and expression profile which have been recently inserted (A) or deleted (B) in the human and mouse lineages. The closed diamonds represent broadly expressed promoters, while open diamonds show results for tissue-restricted promoters. The numbers of promoters in each category are shown in parentheses next to these points. The error bars represent the 95% confidence interval from 1000 samplings of the data with replacement. The gray bar shows the same 95% confidence interval for genome permuted intervals. The dashed line describes the mean of this expected distribution. (C,D) Percentage of promoters with tissue-biased expression that were inserted (C) or deleted (D), subdivided by biased tissue expression, where the number of samples for each tissue (described in Fig. 1F) is shown in parentheses. The gray bars show the 95% confidence interval for genome permuted intervals for each promoter class, while the dashed line shows the mean of this distribution.
Figure 4.
Derived allele frequencies in promoters of different evolutionary outcomes. (A–C) Odds ratios of derived allele frequencies for rare (<1.5%) and nonrare (>5%) derived alleles compared between the genome-wide distribution and the tested sequence category as labeled. Odds ratios of 1.0 indicate equality with the genome-wide distribution, higher values indicate relative selective constraint, and values <1 are indicative of net positive selection. Odds ratios for single nucleotide polymorphisms (SNPs) at the 2nd codon position, fourfold-degenerate sites, and within all protein-coding sequence are shown in gray as points of reference. The numbers of informative SNPs overlapping each category are shown in parentheses next to the axis labels. (D) Derived allele frequency odds ratios for promoters with matched expression between species and different expression profiles and tissue biases. As in A–C, odds ratios for SNPs at the 2nd codon position, fourfold-degenerate sites, and within all protein-coding sequence are shown in gray. The numbers of SNPs overlapping each category are shown in parentheses next to the axis labels.
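The odds ratio in this caption can be illustrated with a short sketch. It assumes the ratio is oriented as (rare/nonrare in the tested category) divided by (rare/nonrare genome-wide), which matches the statement that higher values indicate constraint but is not spelled out in the caption. The function names and the toy counts are hypothetical.

```python
def classify_daf(daf):
    """Bin a derived allele frequency using the caption's thresholds."""
    if daf < 0.015:
        return "rare"
    if daf > 0.05:
        return "nonrare"
    return None  # intermediate frequencies are not used in the comparison

def daf_odds_ratio(rare_cat, nonrare_cat, rare_genome, nonrare_genome):
    """Odds of a SNP being rare in the category relative to genome-wide.

    Assumed orientation: values above 1 mean an excess of rare derived
    alleles in the category, i.e. relative selective constraint.
    """
    if min(nonrare_cat, rare_genome, nonrare_genome) <= 0:
        raise ValueError("counts in denominators must be positive")
    return (rare_cat / nonrare_cat) / (rare_genome / nonrare_genome)

# Hypothetical counts, for illustration only.
ratio = daf_odds_ratio(rare_cat=60, nonrare_cat=20,
                       rare_genome=100, nonrare_genome=100)
print(ratio)
```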
Figure 5.
Promoter insertions frequently contain repetitive elements. (A) Enrichment of repetitive elements across recently inserted human promoters relative to the genome-wide expectation for insertions across promoter classes and expression profiles. The 95% confidence interval for genome permuted intervals is shown in gray, and the direction of transcription is shown by the arrows. The numbers of promoters which contribute to each enrichment are shown in the corresponding histograms in B. (B) Frequency of repetitive element families across recently inserted human promoters of the expression profiles, as in A.
Figure 6.
Compensatory promoter turnover and positive selection. (A) Human, mouse, and horse alignments at the PDE4C locus. Four promoters are shown, which are conserved (promoters 1 and 3), human-deleted (promoter 2), or mouse-deleted (promoter 4). Gene models supported by the CAGE promoters are shown in the blue boxes, where solid boxes represent coding exons and empty boxes noncoding exons. The histograms in red describe the log2-transformed expression level of the annotated promoters. Orthologous sequence identified between species is highlighted by the green boxes between these sequences. (B) Frequencies of 1:1 orthologous genes in human and mouse categorized by the type of promoter sequence turnover events. The blue circles represent genes with a greater proportion of promoter births than deaths, while the purple circles similarly represent genes with a greater proportion of promoter deaths. Genes with an equal number of promoter births and deaths are shown in the yellow circles. All genes are shown in the outer circles, while the numbers in the inner circles show those with evidence for compensatory promoter turnovers. Genes with only expression turnover at their promoters are shown in the orange segment, while the remainder of the green circle indicates the number of genes with a conserved promoter architecture. (C,D) Enrichments of human orthologous genes with different turnover events and expression profiles relative to genes with a conserved promoter architecture. χ² test, (*) P < 0.05, (**) P < 0.01, (***) P < 0.001. (E) Enrichments of orthologous genes with coding sequence positive selection. Genes are classified by the possible different evolutionary outcomes of their associated promoters relative to genes with a conserved promoter architecture. χ² test, (*) P < 0.05, (**) P < 0.01, (***) P < 0.001.
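The significance stars in panels C to E come from χ² tests. A minimal sketch of such a test on a 2×2 table is given here, assuming one degree of freedom and no continuity correction; the caption does not describe the table layout, so the layout, the function names, and the counts are all hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Assumption: no Yates correction. For one degree of freedom the
    P-value equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

def significance_stars(p_value):
    """Map a P-value to the star notation used in the caption."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""

# Hypothetical counts: genes in a functional category (columns: yes, no)
# for turnover genes (first row) and conserved-architecture genes (second row).
statistic, p_value = chi2_2x2(30, 10, 10, 30)
print(statistic, significance_stars(p_value))
```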
