Comparative Study
Genome Biol Evol. 2017 Jun 1;9(6):1351-1373.
doi: 10.1093/gbe/evx081.

The Genomic Impact of Gene Retrocopies: What Have We Learned from Comparative Genomics, Population Genomics, and Transcriptomic Analyses?

Claudio Casola et al. Genome Biol Evol. 2017.

Abstract

Gene duplication is a major driver of organismal evolution. Gene retroposition is a mechanism of gene duplication whereby a gene's transcript is used as a template to generate retroposed gene copies, or retrocopies. Intriguingly, the formation of retrocopies depends upon the enzymatic machinery encoded by retrotransposable elements, genomic parasites occurring in the majority of eukaryotes. Most retrocopies are depleted of the regulatory regions found upstream of their parental genes; therefore, they were initially considered transcriptionally incompetent gene copies, or retropseudogenes. However, examples of functional retrocopies, or retrogenes, have accumulated since the 1980s. Here, we review what we have learned about retrocopies in animals, plants and other eukaryotic organisms, with a particular emphasis on comparative and population genomic analyses complemented with transcriptomic datasets. In addition, these data have provided information about the dynamics of the different "life cycle" stages of retrocopies (i.e., polymorphic retrocopy number variants, fixed retropseudogenes and retrogenes) and have provided key insights into the retroduplication mechanisms, the patterns and evolutionary forces at work during the fixation process and the biological function of retrogenes. Functional genomic and transcriptomic data have also revealed that many retropseudogenes are transcriptionally active and a biological role has been experimentally determined for many. Finally, we have learned that not only non-long terminal repeat retroelements but also long terminal repeat retroelements play a role in the emergence of retrocopies across eukaryotes. This body of work has shown that mRNA-mediated duplication represents a widespread phenomenon that produces an array of new genes that contribute to organismal diversity and adaptation.

Keywords: new functions; pollen expression; regulatory element; retroCNV; retrocopy; retrogene; retropseudogene; testis expression.

Figures

Fig. 1.
(A) LINEs are autonomous non-LTR retrotransposable elements. They are transcribed by RNA polymerase II, and their transcript encodes a reverse transcriptase (RT) and additional activities (RNA binding and endonuclease activity, among others; some are not well characterized) in two open reading frames (Cost et al. 2002). After the L1 transcript is translated in the cytoplasm, its poly(A) tail is bound by the L1 proteins and the transcript is carried to the nucleus. In the nucleus, the transcript undergoes target-primed reverse transcription (TPRT) after the endonuclease nicks the DNA and a 3′ end becomes available to prime the RT reaction (Cost et al. 2002). It is still unclear how all the reactions occur (Mandal and Kazazian 2016), but L1-mediated retrotransposition leaves clear hallmarks: a short remnant of the poly(A) tail at the 3′ end and target site duplications (TSDs; Vanin 1985). TPRT often produces 5′-truncated copies with a well-preserved 3′ end of the element, including the short remnant of the poly(A) tail (Zingler et al. 2005). L1 elements can also mediate the retrotransposition of mRNAs in cells. The poly(A) tail of cytoplasmic mRNAs might be recognized by L1 proteins, and the mRNAs might then be carried to the nucleus, undergo TPRT and be inserted in the genome. The hallmarks of this process are an intronless copy of a gene with a remnant of a poly(A) tail at the 3′ end and TSDs flanking the insertion. 3′-UTRs will often be complete, whereas copies might be 5′ truncated from the onset (often the 5′-UTR is shorter, and sometimes the CDS is affected by the truncation, likely producing a retropseudogene). (B) L1 can transduce downstream regions when the transcript produced by the element is unusually long (i.e., it is not polyadenylated at the typical polyadenylation site) and includes a downstream region that could potentially encompass a gene (Goodier et al. 2000; Pickeral et al. 2000).
(C) Autonomous LTR retrotransposable elements are transcribed and translated, and the encoded proteins assemble a virus-like particle where reverse transcription of the LTR RNA occurs. The product is a double-stranded copy of the LTR retroelement flanked by LTRs, which proteins bind, bring to the nucleus and integrate in the genome (Levin and Moran 2011). During reverse transcription, the reverse transcriptase might switch templates and start retrotranscribing a cellular mRNA; when it switches back to retrotranscribe the end of the LTR transcript, it produces a retrocopy that will often contain an incomplete CDS but can potentially contain a whole retrocopy of a cellular gene (Derr et al. 1991; Tan et al. 2016). Template switching occurs if, by chance, there are regions of sequence similarity between the mRNA and the nascent cDNA (Derr et al. 1991). When template switching occurs more than once, chimeric retrocopies can be produced (Wang et al. 2006; Tan et al. 2016). These retrocopies, or more often partial retrocopies, will be flanked by LTR sequences on both sides.
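The L1 hallmarks named in panel (A), a poly(A) remnant at the 3′ end of the insertion and TSDs flanking it, can be illustrated with a minimal Python sketch. The function names, the exact-match comparison and the thresholds (`min_polya`, `min_tsd`, `max_len`) are illustrative assumptions, not values or methods taken from the review; real retrocopy annotation would also need to tolerate mismatches and confirm the absence of introns.

```python
def trailing_polya(seq: str) -> int:
    """Count the uninterrupted run of A bases at the 3' end of seq."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def tsd_length(upstream: str, downstream: str, max_len: int = 30) -> int:
    """Return the longest k for which the last k bases of the upstream flank
    exactly match the first k bases of the downstream flank (a candidate TSD).
    max_len is an assumed upper bound on TSD size."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for k in range(min(max_len, len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


def has_l1_hallmarks(upstream: str, insert: str, downstream: str,
                     min_polya: int = 6, min_tsd: int = 5) -> bool:
    """True if the insert ends in a poly(A) remnant and is flanked by a TSD.
    Both thresholds are hypothetical defaults chosen for illustration."""
    return (trailing_polya(insert) >= min_polya
            and tsd_length(upstream, downstream) >= min_tsd)


if __name__ == "__main__":
    # Toy sequences: the flanks share the 10-bp repeat AGCTTGACCT.
    up = "GGCTAAGCTTGACCT"
    ins = "ATGGCCGTTCAG" + "A" * 8
    down = "AGCTTGACCTTTGCA"
    print(tsd_length(up, down), trailing_polya(ins), has_l1_hallmarks(up, ins, down))
```

Exact matching keeps the sketch short; because TSDs and poly(A) remnants decay over time, older insertions would need a mismatch-tolerant comparison.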
Fig. 2.
(A) Reads from the newly sequenced genome are mapped to the reference genome. Reads from the parental gene will map well (Set 1). Reads from the retroCNV will not map well (Sets 2–4). They will hit the region of the parental gene, but with problems: reads spanning exon–exon junctions (Set 2), paired reads mapping farther apart than expected from the sequencing protocol (Set 3), and discordant reads (i.e., reads mapping to different regions in the genome; Set 4). (B) The reads that do not map well can be used to assemble the retroCNV.
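The four read sets in panel (A) can be sketched as a simple classifier over aligned read pairs. This is a simplified illustration under stated assumptions, not the pipeline used in any of the reviewed studies: the `Mate` and `Gene` types, the `classify_pair` function and the `max_insert` threshold are hypothetical, and an exon–exon junction is assumed to appear as an `N` (skipped region) operation in the CIGAR string, as a splice-aware aligner would report it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mate:
    chrom: str
    pos: int     # 1-based leftmost mapped position
    cigar: str   # e.g. "100M" or "50M3000N50M"


@dataclass(frozen=True)
class Gene:
    chrom: str
    start: int
    end: int

    def contains(self, mate: Mate) -> bool:
        """True if the mate's start position falls inside the gene interval."""
        return mate.chrom == self.chrom and self.start <= mate.pos <= self.end


def classify_pair(m1: Mate, m2: Mate, parental: Gene, max_insert: int = 500) -> str:
    """Assign a read pair to one of the sets in Fig. 2A.
    max_insert is an assumed library-specific limit on mate distance."""
    in1, in2 = parental.contains(m1), parental.contains(m2)
    if not (in1 or in2):
        return "unrelated"          # neither mate touches the parental gene
    if in1 != in2:
        return "set4_discordant"    # one mate maps elsewhere in the genome
    if "N" in m1.cigar or "N" in m2.cigar:
        return "set2_junction"      # read spans an exon-exon junction
    if abs(m1.pos - m2.pos) > max_insert:
        return "set3_wide"          # mates farther apart than expected
    return "set1_concordant"        # consistent with the parental gene


if __name__ == "__main__":
    gene = Gene("chr1", 1000, 9000)
    pair = (Mate("chr1", 8900, "100M"), Mate("chr7", 44000, "100M"))
    print(classify_pair(*pair, gene))
```

Pairs in Sets 2–4 are the ones that panel (B) proposes to use for assembling the retroCNV; Set 4 pairs additionally point to the insertion site.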
Fig. 3.
Mechanisms for retrocopy transcription. (A) The retrocopy might acquire regulatory regions from an existing gene after inserting within that gene. (B) Regulatory regions might be carried over from the parental gene if an aberrant or longer transcript of the parental gene is produced. (C) A regulatory region might evolve from a proto-regulatory region or from a region with no regulatory function. (D) A retrocopy might be expressed from a bidirectional regulatory region if inserted in head-to-head orientation. (E) A retrocopy might be expressed if inserted between a gene and its regulatory region. See text for more details.
Figure Box 1.
—Illustration of the different fates and steps in the retrogene “life cycle” that can be compared to make evolutionary inferences. See text for more details.

References

    1. Abdelsamad A, Pecinka A. 2014. Pollen-specific activation of Arabidopsis retrogenes is associated with global transcriptional reprogramming. Plant Cell 26:3299–3313.
    2. Abegglen LM, et al. 2015. Potential mechanisms for cancer resistance in elephants and comparative cellular response to DNA damage in humans. JAMA 314:1850–1860.
    3. Abyzov A, et al. 2013. Analysis of variable retroduplications in human populations suggests coupling of retrotransposition to cell division. Genome Res. 23:2042–2052.
    4. Andersen RD, Birren BW, Taplitz SJ, Herschman HR. 1986. Rat metallothionein-1 structural gene and three pseudogenes, one of which contains 5′-regulatory sequences. Mol Cell Biol. 6:302–314.
    5. Ashworth A, Skene B, Swift S, Lovell-Badge R. 1990. Zfa is an expressed retroposon derived from an alternative transcript of the Zfx gene. EMBO J. 9:1529–1534.
