Mol Biol Evol. 2005 Oct;22(10):1983-91.
doi: 10.1093/molbev/msi188. Epub 2005 Jun 8.

Evolutionary diversity and potential recombinogenic role of integration targets of Non-LTR retrotransposons

Andrew J Gentles et al. Mol Biol Evol. 2005 Oct.

Abstract

Short interspersed elements (SINEs) make up a significant fraction of total DNA in mammalian genomes, providing a rich substrate for chromosomal rearrangements by SINE-SINE recombinations. Proliferation of mammalian SINEs is mediated primarily by long interspersed element 1 (L1) non-long terminal repeat retrotransposons that preferentially integrate at DNA sequence targets with an average length of approximately 15 bp and containing conserved endonucleolytic nicking signals at both ends. We report that sequence variations in the first of the two nicking signals, represented by a 5'-TT-AAAA consensus sequence, affect the position of the second signal thus leading to target site duplications (TSDs) of different lengths. The length distribution of TSDs appears to be affected also by L1-encoded enzyme variants because targets with the same 5' nicking site can be of different average lengths in different mammalian species. Taking this into account, we reanalyzed the second nicking site and found that it is larger and includes more conserved sites than previously appreciated, with a consensus of 5'-ANTNTN-AA. We also studied potential involvement of the nicking sites in stimulating recombinations between SINEs. We determined that SINEs retaining TSDs with perfect 5'-TT-AAAA nicking sites appear to be lost relatively rapidly from the human and rat genomes and less rapidly from dog. We speculate that the introduction of DNA breaks induced by recurring endonucleolytic attacks at these sites, combined with the ubiquitousness of SINEs, may significantly promote recombination between repetitive elements, leading to the observed losses. At the same time, new L1 subfamilies may be selected for "incompatibility" with preexisting targets. This provides a possible driving force for the continual emergence of new L1 subfamilies which, in turn, may affect selection of L1-dependent SINE subfamilies.

Figures

Figure 1
Flanking repeats were identified by alignment of 5′ and 3′ sequences immediately adjacent to repeat elements. Here, an Alu is shown integrated in the genome, and is flanked by TSDs (target site duplications). The first nicking site is inferred as the hexamer starting 2 bp 5′ of the upstream TSD (TT-AAAA in the example shown). As indicated, the TT dinucleotides are part of the first nicking site, but not part of the TSD. The second nicking site is located at the 3′ terminal end of the TSD. The flanking repeat (TSD) lengths are distributed with median 15 bp in most cases (see text).
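The layout described in this caption can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names, the `min_len` parameter, and the toy sequences are assumptions, and it handles perfect repeats only. The 3′ flank is taken to start immediately after the element, including any poly-A tail.

```python
def find_tsd(upstream: str, downstream: str, min_len: int = 4) -> str:
    """Longest perfect repeat shared by the END of the 5' flank and the
    START of the 3' flank; returns "" if none reaches min_len."""
    longest = min(len(upstream), len(downstream))
    for k in range(longest, max(min_len, 1) - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return ""


def first_nicking_site(upstream: str, tsd: str) -> str:
    """Hexamer starting 2 bp 5' of the upstream TSD, written as 'NN-NNNN'.
    The first two bases lie outside the TSD, the last four inside it."""
    start = len(upstream) - len(tsd) - 2
    if len(tsd) < 4 or start < 0:
        return ""
    hexamer = upstream[start:start + 6]
    return hexamer[:2] + "-" + hexamer[2:]


# Toy flanks (hypothetical): a 15 bp TSD preceded by TT in the 5' flank.
upstream = "GCGCTT" + "AAAAGTCCATGGCAT"
downstream = "AAAAGTCCATGGCAT" + "CCGTA"
tsd = find_tsd(upstream, downstream)
print(len(tsd), first_nicking_site(upstream, tsd))
```

Real data would need tolerance for mismatches between the two repeat copies, which this sketch deliberately leaves out.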
Figure 2
Proportions of different target types in young SINE elements (less than 2% diverged from their consensus) that have a flanking repeat longer than 10 bp with no mismatches between the 5′ and 3′ repeat copies.
Figure 3
Distribution of lengths of flanking repeats for TT-AAAA targets, and variants involving a single A→G change (the most common target types observed). In each panel, the horizontal axis shows flanking repeat length, while the vertical axis shows relative frequency. The plots have been smoothed to facilitate visual comparisons. Significance of differences between distributions (evaluated by Kolmogorov-Smirnov tests) is tabulated in Tables 1 and 2, and in supplementary table S1, as discussed in the text.
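For readers unfamiliar with the comparison named in this caption, the two-sample Kolmogorov-Smirnov statistic is the largest gap between two empirical cumulative distributions. The sketch below computes only that statistic, not a p-value, and the function name and length samples are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the maximum absolute difference between
    the empirical CDFs of the two samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum gap is always reached at one of the observed values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy flanking-repeat lengths (hypothetical) for two target types.
lengths_perfect = [13, 14, 15, 16]
lengths_variant = [15, 16, 17, 18]
print(ks_statistic(lengths_perfect, lengths_variant))
```

Identical samples give 0, and samples that do not overlap at all give 1.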
Figure 4
Determination of the consensus sequence for the second endonucleolytic nicking sequence. The top panel shows χ² values at the indicated positions relative to the nicking site (see Jurka, 1997, for a detailed description of the methodology). χ² values above the horizontal graded line at χ² = 16.27 are significant at the p = 0.001 level. Nucleotide composition at each position relative to the predicted nicking site is shown in the bottom panel. At positions −1, −3, and −5, no nucleotide is significantly over-represented (“N”). At positions +1, +2, and −6, “A” is unambiguous. “T” has the highest χ² at positions −2 and −4. Horizontal lines in the bottom panel show the mean composition of the 3′ flanking regions of the elements studied, identified by nucleotide letter on the right-side axis.
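One way to read the threshold in this caption: 16.27 is the χ² critical value at p = 0.001 for three degrees of freedom, which is what a four-nucleotide goodness-of-fit test gives. The sketch below assumes such a test, with observed base counts at one position compared against a background composition. The exact procedure is the one described in Jurka (1997), so the function name, counts, and background here are illustrative assumptions.

```python
CRITICAL_CHI2 = 16.27  # threshold quoted in the caption (p = 0.001)


def chi_square(counts, background):
    """Goodness-of-fit statistic for base counts at one position against
    background frequencies (both are dicts keyed by A, C, G, T)."""
    total = sum(counts.get(base, 0) for base in "ACGT")
    chi2 = 0.0
    for base in "ACGT":
        expected = total * background[base]
        chi2 += (counts.get(base, 0) - expected) ** 2 / expected
    return chi2


# Hypothetical counts at one position, with a uniform toy background.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
position_counts = {"A": 70, "C": 10, "G": 10, "T": 10}
score = chi_square(position_counts, uniform)
print(score, score > CRITICAL_CHI2)
```

In the caption's setup the background would be the mean composition of the 3′ flanking regions rather than a uniform distribution.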
Figure 5
Target decay as a function of divergence from consensus (age of repeat copy). Vertical bars are √N estimates of errors, where N is the sample size. Species/SINE type is indicated by separate symbols for human Alu (grey-filled circle, long-dashed line), dog SINEC (white triangle, solid line), and rat SINE elements (black-filled square, dotted line). Repeats were grouped into bins of 1% width. An alternative would be to separate SINEs by subfamily and use average divergence from consensus together with average target frequencies. This would introduce considerable variation induced by the genomic context of the SINE, with elements in regions of the genome with a higher mutation rate being more diverged than elements in regions with a low mutation rate. The plotted lines are linear regressions of the data points. The slopes derived are as follows: [Table: see text]
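The binning and line fitting described in this caption can be sketched as follows. The record format, function names, and numbers are assumptions chosen for illustration, and the fit is an ordinary unweighted least-squares line, which may differ from the fitting procedure actually used.

```python
import math


def bin_by_divergence(records, width=1.0):
    """Group (divergence_percent, has_target) records into bins of the
    given width; returns {bin_index: (n_elements, n_with_target)}."""
    bins = {}
    for divergence, has_target in records:
        idx = int(divergence // width)
        n, k = bins.get(idx, (0, 0))
        bins[idx] = (n + 1, k + (1 if has_target else 0))
    return bins


def sqrt_error(count):
    """Square-root-of-N error estimate for a count."""
    return math.sqrt(count)


def fit_line(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data (hypothetical): target frequency falling with divergence.
print(bin_by_divergence([(0.2, True), (0.9, False), (1.5, True)]))
print(fit_line([0, 1, 2, 3], [10, 8, 6, 4]))
```

A negative slope corresponds to targets being lost as repeat copies age.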
Figure 6
TT-AAAA target loss relative to TT-AGAA. Plots show the ratio of TT-AAAA to TT-AGAA targets in human, dog, and rat, normalized to be 1 at 0% divergence from consensus (i.e., if the ratio of TT-AAAA to TT-AGAA targets is r_d for repeats that are d% diverged from their consensus, then the figure shows r_d/r_0). The initial ratios of TT-AAAA to TT-AGAA are 3.4:1, 4.2:1, and 7.6:1 in human, dog, and rat, respectively.
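The normalization in this caption is a one-line calculation. The sketch below assumes target counts are available per divergence bin; the function name and counts are made up, with the toy 0% bin chosen so that its ratio is 3.4:1.

```python
def normalized_ratio(perfect_counts, variant_counts):
    """Ratio of TT-AAAA to TT-AGAA counts per divergence bin, divided by
    the ratio in the 0% bin so that the curve starts at 1."""
    r0 = perfect_counts[0] / variant_counts[0]
    result = {}
    for d in sorted(perfect_counts):
        if variant_counts.get(d, 0) > 0:
            result[d] = (perfect_counts[d] / variant_counts[d]) / r0
    return result


# Toy counts (hypothetical) keyed by divergence bin in percent.
tt_aaaa = {0: 340, 1: 170}
tt_agaa = {0: 100, 1: 100}
print(normalized_ratio(tt_aaaa, tt_agaa))
```

Values below 1 at higher divergence mean TT-AAAA targets are lost faster than TT-AGAA targets.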
Figure 7
Loss of perfect targets by recombination between two similar SINEs. Each sequence is flanked by a perfect repeat, which is different for the two elements. Recombination results in loss of the 3′ repeat copy of SINE A and the 5′ repeat copy of SINE B. At the same time, the resulting composite element is no longer flanked by a repeat, since the 5′ and 3′ flanking sequences come from SINE A and SINE B, respectively.
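The bookkeeping in this caption can be made concrete with a toy model. Everything here is a simplifying assumption: the `Sine` record, the `recombine` helper, and the sequences are hypothetical, and each flank is reduced to just its repeat copy.

```python
from typing import NamedTuple


class Sine(NamedTuple):
    flank5: str  # 5' repeat copy
    body: str    # element sequence
    flank3: str  # 3' repeat copy


def has_perfect_repeat(element: Sine) -> bool:
    """True when the 5' and 3' repeat copies are identical."""
    return element.flank5 == element.flank3


def recombine(a: Sine, b: Sine, crossover: int) -> Sine:
    """Composite element: 5' flank and body start from A, body end and
    3' flank from B. A's 3' copy and B's 5' copy are lost."""
    return Sine(a.flank5, a.body[:crossover] + b.body[crossover:], b.flank3)


# Two similar elements, each with its own perfect flanking repeat.
sine_a = Sine("AAAAGTC", "GGCCGGGCGC", "AAAAGTC")
sine_b = Sine("AAGATTC", "GGCCGGGCGC", "AAGATTC")
composite = recombine(sine_a, sine_b, 5)
print(has_perfect_repeat(sine_a), has_perfect_repeat(composite))
```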

References

    1. Babcock M, Pavlicek A, Spiteri E, Kashork CD, Ioshikhes I, Shaffer LG, Jurka J, Morrow BE. Shuffling of genes within low-copy repeats on 22q11 (LCR22) by Alu-mediated recombination events during evolution. Genome Res. 2003;13:2519–2532.
    2. Bailey JA, Liu G, Eichler EE. An Alu transposition model for the origin and expansion of human segmental duplications. Am J Hum Genet. 2003;73:823–834.
    3. Bentolila S, Bach JM, Kessler JL, Bordelais I, Cruaud C, Weissenbach J, Panthier JJ. Analysis of major repetitive DNA sequences in the dog (Canis familiaris) genome. Mamm Genome. 1999;10:699–705.
    4. Brosius J. Echoes from the past - are we still in an RNP world? Cytogenet Genome Res. 2005; in press.
    5. Brouha B, Schustak J, Badge RM, Lutz-Prigge S, Farley AH, Moran JV, Kazazian HH Jr. Hot L1s account for the bulk of retrotransposition in the human population. Proc Natl Acad Sci USA. 2003;100:5280–5285.
