Comparative Study
Genome Res. 2005 Aug;15(8):1073-8.
doi: 10.1101/gr.3688905. Epub 2005 Jul 15.

Gene-breaking: a new paradigm for human retrotransposon-mediated gene evolution

Sarah J Wheelan et al. Genome Res. 2005 Aug.

Abstract

The L1 retrotransposon is the most highly successful autonomous retrotransposon in mammals. This prolific genome parasite may on occasion benefit its host through genome rearrangements or adjustments of host gene expression. In examining possible effects of L1 elements on host gene expression, we investigated whether a full-length L1 element inserted in the antisense orientation into an intron of a cellular gene may actually split the gene's transcript into two smaller transcripts: (1) a transcript containing the upstream exons and terminating in the major antisense polyadenylation site (MAPS) of the L1, and (2) a transcript derived from the L1 antisense promoter (ASP) that includes the downstream exons of the gene. Bioinformatic analysis and experimental follow-up provide evidence for this L1 "gene-breaking" hypothesis. We identified three human genes apparently "broken" by L1 elements, as well as 12 more candidate genes. Most of the inserted L1 elements in our 15 candidate genes predate the human/chimp divergence. If indeed split, the transcripts of these genes may in at least one case encode potentially interacting proteins, and in another case may encode novel proteins. Gene-breaking represents a new mechanism through which L1 elements remodel mammalian genomes.
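
The arrangement described in the abstract (a full-length L1, in the antisense orientation, inside an intron) can be expressed as a simple coordinate test. The following is only a minimal sketch of that criterion, not the authors' pipeline: the `Interval` type, the function name `is_gene_breaking_candidate`, the coordinate convention, and the 6000-bp length cutoff are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    """Hypothetical annotation record: 0-based start, exclusive end, strand."""
    start: int
    end: int
    strand: str  # "+" or "-"


def is_gene_breaking_candidate(
    gene_strand: str,
    introns: List[Tuple[int, int]],
    l1: Interval,
    min_l1_length: int = 6000,  # assumed cutoff for "full-length"; not from the paper
) -> bool:
    """Return True if the L1 meets the three positional criteria in the abstract."""
    # Criterion 1: the element is long enough to count as full-length.
    full_length = (l1.end - l1.start) >= min_l1_length
    # Criterion 2: the element lies on the strand opposite to the gene.
    antisense = l1.strand != gene_strand
    # Criterion 3: the element is contained entirely within a single intron.
    intronic = any(start <= l1.start and l1.end <= end for start, end in introns)
    return full_length and antisense and intronic


# Made-up coordinates for illustration only.
example_introns = [(1000, 9000), (9500, 12000)]
example_l1 = Interval(start=2000, end=8020, strand="-")
print(is_gene_breaking_candidate("+", example_introns, example_l1))  # True
```

A positional filter like this would only nominate candidates; the evidence for split transcripts reported in the abstract comes from the bioinformatic analysis and experimental follow-up, which this sketch does not model.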

Figures

Figure 1.
Gene-breaking model. A generic gene is shown, containing an L1 element positioned in the antisense direction in an intron. Three transcripts could result from this arrangement: transcript A, an early-terminating upstream transcript containing the 5′ exons, part of the intron, and part of L1; transcript B, a transcript originating from the antisense promoter in L1 and containing part of L1 spliced to the downstream exons; and transcript C, the native (expected) transcript. Arrows show the direction of transcription; the arrow for the antisense L1 indicates transcription from the native L1 promoter. Red letters, the poly(A) signal; small arrows, polyadenylation sites; TSD, target site duplication; UTR, untranslated region; ORF, open reading frame; ASP, antisense promoter.
Figure 2.
Alignments to the MAPS region. Alignments of six EST sequences, one predicted gene, and one mRNA sequence to the MAPS region of L1rp. In red is the L1 polyadenylation signal. The sequences terminating in uppercase “A”s have poly(A) tails in their database records. Termination sites, as described in Han et al. (2004), are shown with arrows. Five of the sequences have poly(A) tails starting at one of the known termination sites; these sites are labeled #1, #2, and #3 to correspond with the first, second, and third polyadenylation sites identified by Han et al. and marked with arrows on the L1 sequence. Sequence CN806550 terminates just 5′ of the first termination site; this may represent a previously unknown termination site.
Figure 3.
RT-PCR cloning of the upstream run-on transcript from NM_024583 (secernin 3). (A) The RT-PCR strategy. The full-length L1 (red box) is located in intron 5 (black line) in the antisense orientation. After cDNA synthesis from the extracted total RNA with the L1-poly T primer (JB8441), the L1-antisense primer (JB8438), which hybridizes just upstream of the MAPS (white star), and the gene-specific primer (JB8435), which anneals within exon 4 (green box), were used to amplify the upstream run-on transcript. Exon 5 is shown as a blue box. (B) Gel purification of the PCR product. From the two different human cell lines, HeLa (lane 1) and HCT116 (lane 3), ∼650-base pair fragments were generated as major bands, whereas no product was amplified without reverse transcriptase on cDNA synthesis (lanes 2,4) and template DNA on PCR (lane 5). The marker is a 100-base pair DNA ladder (lane C). (C) The sequence of the ∼650-base pair PCR product contains the 5′ end of intron 5 (in black) but not intron 4 (orange in A), which was spliced out. Colors as in Figure 2a. In D,E,F, gray boxes and lines indicate sequence and splicing not observed in the database or in our transcripts, as these short sequencing products do not contain the entire mRNA, but which are inferred from the sequence of the full-length mRNA. (D) NM_024583 (secernin 3). The L1 is 6 kb and is shown as an arrow; the direction of the arrow indicates the direction of transcription from the native L1 promoter. The first line shows the full-length gene structure as defined by Spidey. Underneath is the polyadenylated transcript discovered and sequenced here that terminates in the L1, and below that is a previously described transcript that starts in the 5′ end of the L1 and is potentially driven by the ASP. (E) NM_004866 (SCAMP1). The L1 is 6 kb and is shown as an arrow. As in (D), the first line shows the full-length gene structure as defined by Spidey, the second line shows the polyadenylated transcript described here, and the third line shows the database sequence BE566710, which is the ASP-derived product. (F) NM_014960 (arylsulfatase G). The full-length gene structure is shown in the first line. Here there are two arrows for the L1. The first arrow indicates a truncated antisense L1 situated upstream of the full-length antisense L1, indicated by the second arrow. The second line is the product described here that terminates at the MAPS in the truncated L1. The last line depicts the database sequence that appears to be derived from the ASP of the full-length L1 element.
Figure 4.
The downstream product of the L1 ASP in Secernin 3 (NM_024583). The full gene product (top line) as well as the MAPS-terminated (line 2) and two ASP-derived (lines 3,4) products are shown; the ASP-derived product that we sequenced, in line 4, skips exon 6 of secernin 3, presumably due to splicing, whereas the database sequence contains exon 6 as well as the L1 sequence and part of exon 7. As in Figure 3, gray boxes and lines indicate the expected structure of the mRNAs shown; these sequences are inferred from the known sequence of the full-length mRNA.
Figure 5.
(A) The MET gene, showing the location of the L1 element as well as the 5′ and L1-promoted ESTs. (B) The MET gene product, showing the cleavage point between the α and β subunits as well as the mapping of the gene's exons onto the protein product.
