Review
Mob DNA. 2016 May 6:7:9.
doi: 10.1186/s13100-016-0065-9. eCollection 2016.

Roles for retrotransposon insertions in human disease

Dustin C Hancks et al. Mob DNA.

Abstract

Over evolutionary time, the dynamic nature of a genome is driven, in part, by the activity of transposable elements (TE) such as retrotransposons. On a shorter time scale, it has been established that new TE insertions can result in single-gene disease in an individual. In humans, the non-LTR retrotransposon Long INterspersed Element-1 (LINE-1 or L1) is the only active autonomous TE. In addition to mobilizing its own RNA to new genomic locations via a "copy-and-paste" mechanism, LINE-1 is able to retrotranspose other RNAs including Alu, SVA, and occasionally cellular RNAs. To date in humans, 124 LINE-1-mediated insertions that result in genetic diseases have been reported. Disease-causing LINE-1 insertions have provided a wealth of insight and the foundation for valuable tools to study these genomic parasites. In this review, we provide an overview of LINE-1 biology followed by highlights from new reports of LINE-1-mediated genetic disease in humans.

Keywords: Alu; Autoimmunity; Cancer; Disease; LINE-1; Retrotransposition; Retrotransposon; SVA.

Figures

Fig. 1
Retrotransposons active in humans. a An autonomous active LINE-1. A full-length LINE-1 ~6 kb in length is shown [36, 41, 239]. LINE-1 encodes three proteins, two of which (ORF1p and ORF2p) are absolutely required for retrotransposition in cis [42, 146]. Currently, the role of ORF0p is unclear [60]; interestingly, it may form fusion proteins with downstream coding sequences by utilizing internal splice donor sites (SD) [60]. LINE-1 transcription is driven from its own promoter (big black bent arrow) [53, 54] located in the 5′-UTR. The 5′-UTR also encodes a weaker antisense promoter (ASP, small black bent arrow) [59]. It has been postulated that the LINE-1 ASP, in conjunction with splice acceptors located on the antisense strand of LINE-1, may contribute to new gene formation via a mechanism termed “gene-breaking” [240]. Termination of LINE-1 transcription is mediated by a polyA signal (AATAAA) located in the 3′-UTR. Occasionally, transcription proceeds past the internal polyA signal and terminates at a downstream one [139, 241]. Such chimeric transcripts, if retrotransposed, may result in 3′-transductions [, –64, 176]. The majority of insertions end in a polyA tail (AAAn) of variable length [37]. In addition, most insertions are characterized by flanking target-site duplications (4-20 bp in length, black horizontal arrows) [35]. CC, coiled-coil domain [47]; RRM, RNA recognition motif [44]; CTD, C-terminal domain; EN, endonuclease [51]; Z domain [242]; RT, reverse transcriptase [52]; C, cysteine-rich; AA, amino acid. b The Alu SINE. Alus are small Pol III-transcribed RNAs derived from 7SL RNA [243]. An Alu element consists of a left and a right monomer, which are derived from an ancient duplication event, separated by an internal A-rich sequence. Alus contain their own transcriptional signals, an A and a B box located in the left monomer. Efficient Alu transcription requires a strong enhancer element in the upstream flanking sequence [103, 104].
Transcription termination of an Alu typically occurs at a Pol III terminator (TTTT) located in the downstream flanking sequence [244]. Similar to LINE-1, Alu insertions end in a polyA tail and are flanked by a target-site duplication. c A canonical SINE-VNTR-Alu (SVA) element consisting of its primary domains (CCCTCT hexamer, Alu-like, VNTR, and SINE-R derived from the env gene and right LTR of a HERV-K) is shown [126]. SVA transcription can initiate upstream (black bent arrow) or in the CCCTCT hexamer (black bent arrow) [126, 127]. Like LINE-1, SVA transcription typically terminates at its own [127] or a downstream polyA signal [24, 65]. d A processed pseudogene (PP) is shown. Note the lack of introns and the presence of a target-site duplication and a 3′-polyA tail similar to LINE-1, Alu, and SVA. e U6 chimera insertion. A U6 snRNA fused with the 3′-end of a LINE-1 sequence formed by “template-switching” [84, 140, 144] is shown. Although the site where ORF2p switches templates varies across the U6 chimera insertions, the junction where the two sequences are joined is typically T-rich [144].
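The read-through described in panel a, where transcription bypasses the element's own polyA signal (AATAAA) and terminates at a downstream one, can be illustrated with a minimal Python sketch. The function names, the toy sequence, and the idea of passing the element length explicitly are assumptions made for illustration only; this is not a method from the review, and real 3′-transduction calling relies on far more evidence than a motif match.

```python
import re
from typing import List

POLYA_SIGNAL = "AATAAA"  # canonical polyA signal named in the caption


def polya_signal_positions(seq: str) -> List[int]:
    """Return 0-based start positions of every AATAAA in seq (overlaps allowed)."""
    # A lookahead keeps the match zero-width so overlapping hits are reported.
    return [m.start() for m in re.finditer("(?=" + POLYA_SIGNAL + ")", seq.upper())]


def candidate_transduced_sequence(element_and_flank: str, element_length: int) -> str:
    """Return the flank up to and including the first downstream AATAAA.

    Hypothetical helper: element_length marks where the element ends and the
    3' flanking sequence begins. Returns '' if the flank has no polyA signal.
    """
    flank = element_and_flank[element_length:].upper()
    hit = flank.find(POLYA_SIGNAL)
    if hit == -1:
        return ""
    return flank[: hit + len(POLYA_SIGNAL)]


# Toy example (made-up sequence): an 11-nt "element" followed by a 12-nt flank.
toy = "GGGAATAAACC" + "TTGCAATAAAGG"
signals = polya_signal_positions(toy)
transduced = candidate_transduced_sequence(toy, element_length=11)
```

In this toy case the first signal lies inside the element and the second in the flank, so the returned flank segment stands in for the sequence that would be carried along as a 3′-transduction.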
Fig. 2
Anatomy of retrotransposon insertions. A variety of structures for retrotransposon insertions (a-k) identified by genomic studies, cell culture retrotransposition assays, and disease-causing insertions that have been reported are shown. Reported frequencies, either from genomic analysis or cell-culture retrotransposition assays, for each structure are located in the upper right-hand corner of each panel. If no frequency data have been reported, only the element’s name is shown. These structures have provided key insights into the mechanism of target-primed reverse transcription (TPRT) [77], retrotransposon transcript structure [127], and the mechanism by which LINE-1-mediated retrotransposition events contribute to genome evolution [62]. a LINE-1 target-site. Most insertions occur at asymmetric AT-rich sequences [6, 37, 51, 86]. The first step of TPRT is cleavage of the bottom-strand by ORF2p endonuclease activity at a motif resembling 5′-TTTT/AA-3′ [245]. The nuclease responsible for top-strand cleavage is currently unknown. The staggered nature of the cleavage events generates a target-site duplication (TSD, sequence in bold). A TSD (black horizontal arrows) is used to define the boundaries of an insertion and is considered a hallmark of LINE-1-mediated retrotransposition events. b Full-length insertion. It is generally accepted that in order for an element to be retrotransposition-competent it must be full-length. c 5′-truncated insertions. Most LINE-1s in the human genome are grossly truncated at their 5′-end [6, 36, 37]. In contrast, most Alus [243] and SVA elements are full-length [123, 127]. To date, no consensus sequence has been identified in LINE-1 or SVA insertions regarding the mechanism of 5′-truncation. However, a new report implicates stem-loop structures as a factor driving 5′-truncation in recent Alu insertions [114]. d 3′-transduction.
Although the first report of a 3′-transduction was a LINE-1 insertion into the dystrophin gene resulting in Duchenne’s muscular dystrophy in 1994 [176], it would be several years before the significance of this chimeric insertion was uncovered. Several years later, as one of the first insights gained from insertions recovered from cell-culture retrotransposition assays, it was reported that LINE-1 frequently bypassed its own polyA signal (AATAAA) in favor of a downstream one (AATAAA) [42]. Subsequently, elegant experimental analysis revealed that utilizing a downstream polyA signal could result in LINE-1-mediated exon-shuffling [62]. An insertion containing a 3′-transduction will typically contain two homopolymer stretches (AAAn) and contain the 3′-TSD from the source locus (gray horizontal arrow) as part of the transduced sequence. Notably, insertions containing serial 3′-transductions have been reported and can be used to track the evolutionary history of an element [246]. e 5′-end inversions. Another hallmark of LINE-1-mediated retrotransposition events is the inversion of the 5′-end (gray horizontal arrow) of the retrotransposon sequence [35]. Small indels are typically identified at the inversion breakpoint [88]. Inversions have only been reported for LINE-1s, SVAs, and processed pseudogenes [196]. 5′-end inversion is presumed not to occur for Alus due to their short length. It has been hypothesized that a phenomenon referred to as twin-priming may account for the frequent inversions associated with LINE-1-mediated retrotransposition events [88]. f 5′-transduction. In some instances, LINE-1 [82] or SVA transcription [126, 127] may initiate upstream of the internal promoter, generating a chimeric transcript. Retrotransposition of this transcript results in duplication of the sequence 5′ of the source locus at a new genomic location.
It has been speculated that 5′-transductions are relatively common for SVA elements because their internal promoter is weaker than that of LINE-1, which has a very strong internal promoter and for which only a handful of 5′-transductions have been reported [82]. g Internal priming. Occasionally, following bottom-strand cleavage, internal A-rich sequences upstream in the retrotransposon RNA may basepair with the T-rich overhang at the target-site instead of the 3′-polyA tail, followed by first-strand cDNA synthesis by ORF2p [–249]. These insertions can be deemed a type of 3′-truncation. h Exon-trapping. Retrotransposons are dispersed throughout the genome, including intronic sequence. LINE-1, Alu, and SVA have all been reported to contain numerous splice sites and to be incorporated into the transcriptome [105, 127, 128, 134, 250]. Interestingly, LINE-1 internal splicing can generate a transcript lacking ORF1 but maintaining a functional ORF2 [251]. In some instances, at least for SVA, retrotransposition of chimeric transcripts containing upstream exons may occur [127, 128, 132]. Notably, SVA itself is thought to have originated from alternative splicing from genomic repeats [126], and SVA-related elements (e.g. LAVA, PVA) appear to have acquired distinct 3′-domains via splicing in gibbons [125, 135, 136, 138]. i 3′-truncation. Premature polyadenylation using either canonical or non-canonical polyadenylation sites results in LINE-1 or SVA RNAs lacking 3′-sequence [127, 252]. If this RNA is retrotransposed, it will result in a 3′-truncated insertion. Consistent with the dispensability of SVA domains [130], 3′-truncations may be more frequent for SVA compared to LINE-1. In principle, 3′-truncated LINE-1 RNAs containing ORF1 coding sequence might be actively retrotransposed, as in the case of ORF1 mNEOi in cell culture [144] and the presence of half-LINE-1 (HAL1) insertions in mammalian genomes [253]. j Target-site deletion.
Another surprise from cell culture retrotransposition assays was the discovery of large deletions associated with new retrotransposition events [82, 83]. Genomic deletions up to 1 Mb have been associated with LINE-1-mediated retrotransposition events in vivo [153]. These insertions occur at a LINE-1 EN cleavage site, are generated by ORF2 reverse-transcriptase activity, and end in a 3′-polyA tail. Currently, the mechanism driving 5′-target-site deletions is unclear; yet, it is tempting to speculate that chromatin looping along with cleavage by LINE-1 or another nuclease may play important roles [82, 83]. k Endonuclease-independent (ENi) insertion. ENi insertions were discovered by the Moran lab when carrying out retrotransposition assays in different Chinese Hamster Ovary (CHO) cell lines lacking key DNA repair factors [213]. Frequent retrotransposition was observed for an engineered LINE-1 element construct, with a catalytically inactive EN, in these cells but not in HeLa cells. Characterization of recovered insertions revealed LINE-1 integration at genomic sites not resembling the LINE-1 EN consensus cleavage site. In addition, the insertions were typically truncated at both the 5′- and 3′-ends [213]. These data suggest that LINE-1 can serve as a “molecular band-aid” [254] at double-stranded DNA breaks [–215] and that LINE-1s lacking a functional EN domain may be able to retrotranspose in certain contexts. Building on these studies, it was later reported that LINE-1s can also integrate at dysfunctional telomeres in an endonuclease-independent manner [216].
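Two hallmarks from the captions, the flanking target-site duplication (4-20 bp per the Fig. 1 caption) and the EN cleavage motif resembling 5′-TTTT/AA-3′, lend themselves to a short illustrative Python sketch. The function names and toy sequences below are hypothetical, exact string matching is a simplifying assumption (real TSDs and cleavage sites tolerate mismatches), and the motif scan simply reads the supplied strand as written without modeling top versus bottom strand.

```python
from typing import List, Optional


def find_tsd(upstream: str, downstream: str,
             min_len: int = 4, max_len: int = 20) -> Optional[str]:
    """Return the longest exact TSD candidate, or None if there is none.

    A candidate is a string that ends the upstream flank and also begins
    the downstream flank of an insertion. The 4-20 bp defaults follow the
    length range quoted in the Fig. 1 caption.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(max_len, len(upstream), len(downstream))
    for k in range(longest, min_len - 1, -1):  # try longest first
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return None


def en_motif_cut_positions(seq: str) -> List[int]:
    """Return 0-based positions of the '/' in every exact TTTT/AA match."""
    seq = seq.upper()
    cuts = []
    start = seq.find("TTTTAA")
    while start != -1:
        cuts.append(start + 4)  # cut falls between TTTT and AA
        start = seq.find("TTTTAA", start + 1)
    return cuts


# Toy flanks (made-up): both sides of the insertion share AAGATTCTA.
tsd = find_tsd("CCGGAAGATTCTA", "AAGATTCTAGGCC")
cuts = en_motif_cut_positions("GGTTTTAACC")
```

Searching from the longest candidate downward avoids reporting a short chance match when a longer duplication is present.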

References

    1. The Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–562. doi: 10.1038/nature01262.
    2. de Koning APJ, Gu W, Castoe TA, Batzer MA, Pollock DD. Repetitive Elements May Comprise Over Two-Thirds of the Human Genome. PLoS Genet. 2011;7:e1002384. doi: 10.1371/journal.pgen.1002384.
    3. The Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005;437:69–87. doi: 10.1038/nature04072.
    4. Smit AF. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev. 1999;9:657–663. doi: 10.1016/S0959-437X(99)00031-3.
    5. Smit AF. The origin of interspersed repeats in the human genome. Curr Opin Genet Dev. 1996;6:743–748. doi: 10.1016/S0959-437X(96)80030-X.