PLoS Genet. 2014 Jul 24;10(7):e1004525.
doi: 10.1371/journal.pgen.1004525. eCollection 2014 Jul.

8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage

Chris M Rands et al. PLoS Genet.

Abstract

Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25-0.31). While enriched with ENCODE biochemical annotations, many of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase I hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci, which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). From extrapolations we estimate that 8.2% (7.1-9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
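
The half-lives quoted in the abstract can be read as describing a simple exponential decay. That reading is an assumption of this sketch, not a restatement of the authors' fitted model. Under it, the fraction of constrained sequence retained after divergence d is 0.5^(d / d1/2), and the decay rate is ln 2 / d1/2. The function names below are hypothetical.

```python
import math


def fraction_retained(d, d_half):
    """Fraction of constrained sequence still constrained after divergence d.

    Assumes pure exponential decay with half-life d_half, both expressed
    in the same units of divergence (an assumption of this sketch).
    """
    return 0.5 ** (d / d_half)


def turnover_rate(d_half):
    """Exponential decay rate corresponding to a given half-life."""
    return math.log(2) / d_half


if __name__ == "__main__":
    # Half-life ranges quoted in the abstract: noncoding, then coding.
    for label, d_half in [("noncoding, low", 0.25), ("noncoding, high", 0.31),
                          ("coding, low", 2.1), ("coding, high", 5.0)]:
        kept = fraction_retained(0.25, d_half)
        print(f"{label}: rate={turnover_rate(d_half):.3f}, "
              f"retained at d=0.25: {kept:.3f}")
```

With these half-lives, coding sequence retains most of its constraint at a divergence where noncoding sequence has lost about half, which is the contrast the abstract describes.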

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Evolutionary turnover of constrained sequence.
A. Quantity of constrained sequence (αselIndel) estimated by NIM1 (blue bars) and NIM2 (red bars) plotted against ancestral repeat divergence for different pairs of eutherian species genomes, with the simulated data (grey) shown under a non-turnover scenario. B. Coding sequence (blue squares) is seen to be broadly conserved, while constrained noncoding sequence (orange circles) shows a strong negative correlation between αselIndel and divergence, indicating rapid turnover.
Figure 2. The overlap of constrained sequence with pan-mammalian conserved sequences.
The proportions (A) and quantities (B) of constrained sequence at the present for different types of biochemically annotated and un-annotated sequences, with and without PhastCons or GERP++ conserved elements, estimated using linear extrapolations (Text S6, Text S7). NIM1 has power to detect functional lineage-specific constrained sequence: NIM1 detects significantly higher fractions of lineage-specific constrained sequence (defined as sequence identified by NIM1 but not annotated by PhastCons or GERP++ as being conserved across mammals) within 3 mutually exclusive classes of ENCODE biochemical annotations compared to sequence lacking such annotation; see Text S6 for details.
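
The caption refers to linear extrapolation (Text S6, Text S7). As a rough illustration of the idea only, the sketch below fits a straight line to constrained-sequence estimates against divergence and reads the intercept at zero divergence as the present-day quantity. The data points are synthetic and chosen to lie exactly on a line; they are not values from the paper. The function name is hypothetical, and the authors' actual procedure is the one described in their supplementary text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic example: divergence of each species pair (x) and the amount
# of mutually constrained sequence estimated for that pair (y, arbitrary units).
divergence = [0.1, 0.2, 0.3, 0.4]
constrained = [170.0, 140.0, 110.0, 80.0]

intercept, slope = fit_line(divergence, constrained)

if __name__ == "__main__":
    # The intercept is the extrapolated quantity at zero divergence.
    print(f"extrapolated present-day quantity: {intercept:.1f}")
    print(f"change per unit divergence: {slope:.1f}")
```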
Figure 3. Constraint and turnover for different classes of human functional element.
A. The total quantities of constrained sequence estimated for the present day by extrapolation for different element types. B. The estimated rate of turnover (b parameter) for different types of constrained element.
Figure 4. Model-based inference of turnover by functional class.
Schematic summary of the fraction of constrained sequence that has been retained (saturated colours) or turned over (pastel colours) in the human lineage over time (X-axis, divergence time) and how it has been distributed across various categories of functional element. In addition to showing the reduced quantity of preserved constrained sequence with increasing divergence, we infer the reciprocal quantity of sequence that is assumed to have been gained over human lineage evolution. For consistency this approach requires mutually exclusive annotation sets, in contrast to those used in Figure 3, making the results not directly comparable. Overlaps between the major different annotations are shown in Figure S10.
