Review. Genome Res. 2010 Oct;20(10):1313-26. doi: 10.1101/gr.101386.109. Epub 2010 Jul 22.

Origins, evolution, and phenotypic impact of new genes

Henrik Kaessmann. Genome Res. 2010 Oct.

Abstract

Ever since the pre-molecular era, the birth of new genes with novel functions has been considered to be a major contributor to adaptive evolutionary innovation. Here, I review the origin and evolution of new genes and their functions in eukaryotes, an area of research that has made rapid progress in the past decade thanks to the genomics revolution. Indeed, recent work has provided initial whole-genome views of the different types of new genes for a large number of different organisms. The array of mechanisms underlying the origin of new genes is compelling, extending way beyond the traditionally well-studied source of gene duplication. Thus, it was shown that novel genes also regularly arose from messenger RNAs of ancestral genes, protein-coding genes metamorphosed into new RNA genes, genomic parasites were co-opted as new genes, and that both protein and RNA genes were composed from scratch (i.e., from previously nonfunctional sequences). These mechanisms then also contributed to the formation of numerous novel chimeric gene structures. Detailed functional investigations uncovered different evolutionary pathways that led to the emergence of novel functions from these newly minted sequences and, with respect to animals, attributed a potentially important role to one specific tissue, the testis, in the process of gene birth. Remarkably, these studies also demonstrated that novel genes of the various types significantly impacted the evolution of cellular, physiological, morphological, behavioral, and reproductive phenotypic traits. Consequently, it is now firmly established that new genes have indeed been major contributors to the origin of adaptive evolutionary novelties.

Figures

Figure 1.
Origin of new gene copies through gene duplication. (A) DNA-based duplication. A common type of segmental duplication—tandem duplication—is shown. It may occur via unequal crossing-over that is mediated by transposable elements (light green). There are different fates of the resulting duplicate genes. For example, one of the duplicates may acquire new functions by evolving new expression patterns and/or novel biochemical protein or RNA functions (see main text for details). (Gold and blue boxes) Exons, (black connecting lines) exon splicing, (red right-angled arrows) transcriptional start sites (TSSs), (gray tubes) nonexonic chromatin. (B) RNA-based duplication (termed retroposition or retroduplication). New retroposed gene copies may arise through the reverse transcription of messenger RNAs (mRNAs) from parental source genes. Functional retrogenes with new functional properties may evolve from these copies after acquisition or evolution of promoters in their 5′ flanking regions that may drive their transcription. (Pink right-angled arrow) TSS, (transparent pink box) additionally transcribed flanking sequence at the insertion site.
Figure 2.
Origin of new chimeric gene or transcript structures. (A) DNA-based (genomic) gene fusion. Partial duplication (and hence fission) of ancestral source genes precedes juxtaposition of partial duplicates and subsequent fusion (presumably mediated by the evolution of novel splicing signals and/or transcription termination/polyadenylation sites). (B) Transcription-mediated gene fusion. Novel transcript structures may arise from intergenic splicing after evolution of novel splicing signals and transcriptional readthrough from the upstream gene. New chimeric mRNAs may sometimes be reverse transcribed to yield new chimeric retrogenes (see also Fig. 1). (Green, blue, red large boxes) Exons, (red right-angled arrows) transcriptional start sites (TSSs), (black connecting lines) constitutive splicing, (dotted lines) splicing of ancestral gene structures, (green lines) intergenic splicing that results in new chimeric transcripts.
Figure 3.
Origin of protein-coding genes from scratch. New coding regions may emerge de novo from noncoding genomic sequences. First, proto-open reading frames (proto-ORFs; thin blue bars) acquire mutations (point substitutions, insertions/deletions; yellow stars) that remove, bit by bit, frame-disrupting nucleotides (red wedges). Transcriptional activation of ORFs (through acquisition of promoters located in the 5′ flanking region) encoding proteins with potentially useful functions may allow for the evolution of novel protein-coding genes. (Large blue box) Functional exon, (pink right-angled arrow) TSS, (transparent pink box) untranslated 5′ sequence. Note that the transcriptional activation step may, alternatively, also precede the formation of complete functionally relevant ORFs.
Figure 4.
Evolutionary origins of long noncoding RNA genes. (A) De novo emergence. In this scenario, previously nonfunctional genomic sequence becomes transcribed (thin red box) through the acquisition/activation of a proto-promoter sequence (right-angled arrows). The transcriptional activation may be followed or preceded by the evolution of (proto-) splice sites (light blue stars). Together, these events allow for the formation of potentially functional and selectively beneficial multi-exonic noncoding RNA genes. (Large red boxes) Exons, (thin black lines) splicing, (red right-angled arrows) TSSs. (B) Origin of noncoding RNA gene from ancestral protein-coding gene. In this process, the original (functionally redundant) protein-coding gene loses its function and becomes a pseudogene. After or during loss of protein function and coding exon decay, a new functional noncoding RNA gene may arise, a process that may draw from regulatory elements and other sequences (splicing signals, exon sequences, polyadenylation sequences, etc.) from the ancestral protein-coding gene. (Blue boxes) Protein-coding exons, (red boxes) RNA exons, (transparent boxes) pseudogenized exons, (thin black lines) splicing, (dotted lines) lost ancestral splicing capacity, (red right-angled arrows) TSSs.
Figure 5.
New genes from domesticated genome parasites. The example shown illustrates the origin of a new placenta gene from an endogenous retrovirus sequence (the scenario illustrates the origin of one of the several syncytin genes that evolved important placenta functions in mammals [Heidmann et al. 2009]; see main text for details). The domestication event involved the decay of two of the human endogenous retrovirus ORFs (gag and pol) and the selective preservation of the ORF encoding the virus envelope protein. (Empty box) Loss of function/decay, (gold boxes) ORFs. The newly formed syncytin gene (transcript structure indicated by thin black line) became transcribed from the retrovirus' long terminal repeat (LTR; green) promoter (TSS shown as red right-angled arrow) and evolved a placenta-specific expression pattern and (fusogenic) function (Heidmann et al. 2009).
Figure 6.
The “out of the testis” hypothesis for the emergence of new genes. This hypothesis suggests that the transcription of new gene copies/structures (green boxes) is facilitated in certain testis germ cells—meiotic spermatocytes and post-meiotic round spermatids (which are found in the seminiferous tubules, where spermatogenesis takes place)—because of the potentially overall permissive chromatin state and overexpression of key components of the transcriptional machinery in these cells. The transcriptionally active chromatin state in spermatocytes and spermatids is thought to be a result of a potentially widespread demethylation of CpG dinucleotide-enriched promoter sequences and modifications (acetylation and methylation) of histones (blue ovals), which facilitate access of the transcriptional machinery (red ovals). Once transcribed, new functional genes (transcripts shown as green wavy lines) with beneficial products may be selectively preserved and evolve more efficient promoters (a process that might be facilitated by the fact that spermatocyte/spermatid-specific expression requires only relatively simple promoters). Eventually, such new genes may also evolve more diverse expression patterns and thus also obtain functions in other (somatic) tissues.

References

    1. Akiva P, Toporik A, Edelheit S, Peretz Y, Diber A, Shemesh R, Novik A, Sorek R 2006. Transcription-mediated gene fusion in the human genome. Genome Res 16: 30–36
    2. Assis R, Kondrashov AS 2009. Rapid repetitive element-mediated expansion of piRNA clusters in mammalian evolution. Proc Natl Acad Sci 106: 7079–7082
    3. Babushok DV, Ohshima K, Ostertag EM, Chen X, Wang Y, Mandal PK, Okada N, Abrams CS, Kazazian HH Jr 2007. A novel testis ubiquitin-binding protein gene arose by exon shuffling in hominoids. Genome Res 17: 1129–1138
    4. Bai Y, Casola C, Feschotte C, Betran E 2007. Comparative genomics reveals a constant rate of origination and convergent acquisition of functional retrogenes in Drosophila. Genome Biol 8: R11 doi: 10.1186/gb-2007-8-1-r11
    5. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE 2002. Recent segmental duplications in the human genome. Science 297: 1003–1007