Mob DNA. 2017 Apr 24;8:6.
doi: 10.1186/s13100-017-0089-9. eCollection 2017.

Insertion and deletion polymorphisms of the ancient AluS family in the human genome

Maria S Kryatova et al. Mob DNA.

Abstract

Background: Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome.

Results: Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements.
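The percentages reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are not from the paper, and only counts stated in the text are used. The four listed categories account for 47 of the 48 confirmed elements; Fig. 3c reports that ten of 11 initial insertion candidates were PCR validated, which may account for the remaining element, although the abstract does not say so explicitly.

```python
# Counts taken from the Results paragraph; names are illustrative.
confirmed_alus = 48
categories = {
    "deletion polymorphism": 37,
    "non-classical Alu insertion": 2,
    "internal priming": 1,
    "classical TPRT": 7,
}


def percent(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


percentages = {name: percent(n, confirmed_alus) for name, n in categories.items()}
categorized = sum(categories.values())

for name, pct in percentages.items():
    print(f"{name}: {categories[name]}/{confirmed_alus} = {pct}%")
print(f"categorized: {categorized}/{confirmed_alus}")
```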

Conclusions: Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.

Keywords: Alu; AluS; Mobile element; Mobilome; Polymorphism; Retrotransposon; SINE; Structural variation.

Figures

Fig. 1
Diagnostic nucleotides differentiate AluS and AluY elements. a Five diagnostic nucleotides that distinguish RepBase consensus sequences of all six AluS subfamilies (AluSc, AluSg, AluSp, AluSq, AluSx, AluSz) from the most common AluY subfamilies (AluY, AluYa5, AluYa8, AluYb8, AluYb9, AluYc1) were identified [3]. b The five diagnostic positions indicated in part (a), shown in the context of the Alu sequence, confirm AluS subfamily classification. The seven AluS TPRT insertion candidates are shown. AluS-specific nucleotides at the diagnostic positions are highlighted in green, AluY-specific nucleotides in red, and nucleotides that are neither AluS nor AluY specific in yellow. The polymorphic Alu element at 11q14.1 has features of both AluS and AluY elements; the further analysis that led to its ultimate classification as an AluS element is shown in Additional file 2: Figure S1
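The diagnostic-position idea in this caption can be sketched as a small lookup. The positions and bases below are placeholders, not the five positions identified in the paper, and the majority rule in `classify_subfamily` is an assumption made for illustration; the paper's high-stringency criteria are not spelled out here.

```python
# HYPOTHETICAL diagnostic table: 0-based positions and bases are placeholders,
# not the actual diagnostic nucleotides from the paper.
# position -> (AluS-specific base, AluY-specific base)
DIAGNOSTIC_SITES = {
    2: ("A", "G"),
    5: ("C", "T"),
    8: ("G", "A"),
    11: ("T", "C"),
    14: ("A", "G"),
}


def score_diagnostic_sites(sequence, sites=DIAGNOSTIC_SITES):
    """Count diagnostic positions matching AluS, AluY, or neither."""
    counts = {"AluS": 0, "AluY": 0, "neither": 0}
    for pos, (alus_base, aluy_base) in sites.items():
        base = sequence[pos].upper() if pos < len(sequence) else "N"
        if base == alus_base:
            counts["AluS"] += 1   # would be shown in green
        elif base == aluy_base:
            counts["AluY"] += 1   # would be shown in red
        else:
            counts["neither"] += 1  # would be shown in yellow
    return counts


def classify_subfamily(sequence, sites=DIAGNOSTIC_SITES):
    """Assumed rule: assign the family with more matching diagnostic sites."""
    counts = score_diagnostic_sites(sequence, sites)
    if counts["AluS"] > counts["AluY"]:
        return "AluS"
    if counts["AluY"] > counts["AluS"]:
        return "AluY"
    return "ambiguous"


# Toy sequences: one matching AluS at every site, one with mixed features.
pure_alus = "NNANNCNNGNNTNNAN"
mixed = "NNANNCNNGNNCNNGN"
print(score_diagnostic_sites(pure_alus), classify_subfamily(pure_alus))
print(score_diagnostic_sites(mixed), classify_subfamily(mixed))
```

An element with mixed features, like the one at 11q14.1, would likely need the kind of additional analysis the caption mentions rather than a simple count.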
Fig. 2
Characterization of the 39 AluS deletion polymorphism candidates. a The pre-deletion allele contains the Alu element (dark gray block arrow); flanking genomic sequence is depicted as a thin dark gray line on both sides of the Alu element. b Schematic of the seven categories of post-deletion alleles observed among the 39 AluS deletion polymorphism candidates. Polymorphic sequences are depicted in light gray; the dark gray parts indicate the sequences that do not vary among individuals. Deletion polymorphism candidates are defined as polymorphisms that encompass only a portion of the Alu element at that locus (e.g. the top five post-deletion allele categories) or that are not limited to only the Alu element (e.g. the bottom four categories). The last category (marked by an asterisk) includes two elements that have features of non-classical Alu insertions and may not be true deletion polymorphisms (see Results). All Alu elements are shown 5′ to 3′. Observed frequencies of each post-deletion allele category among the 39 AluS deletion polymorphism candidates are shown
Fig. 3
AluS insertion polymorphism candidates. a Empty (pre-insertion) allele prior to AluS element insertion with the target site (TS) sequence noted. b Filled allele after a classical TPRT insertion. The ~300 bp long Alu element consists of two monomers separated by an A-rich region, and also contains a 3′ poly-A tail (An). The TS sequence is duplicated (TSD) and flanks the Alu insertion. c Of the 11 initial AluS insertion polymorphism candidates, ten were PCR validated to be polymorphic in the population. Of these, seven (70%) are full-length elements with 3′ poly-A tails, flanked by TSDs, and are thus classical TPRT insertion candidates. One AluS insertion polymorphism candidate (10%) is full-length and flanked by TSDs, but lacks a 3′ poly-A tail, and thus likely arose by a mechanism involving internal priming (IP). Two AluS insertion polymorphism candidates (20%) are both 5′ and 3′ truncated, lack flanking TSDs, and include non-Alu sequence (shown in purple), thus exhibiting features of non-classical Alu insertions (NCAI). d Characteristics of AluS insertion polymorphism candidates
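The decision rules in part (c) of this caption can be written as a small rule-based classifier. This is a minimal sketch, assuming each candidate has already been annotated with the four boolean features below; the class and field names are hypothetical and the feature detection itself (finding TSDs, poly-A tails, truncation) is not shown.

```python
from dataclasses import dataclass


@dataclass
class InsertionCandidate:
    """Hypothetical per-element annotation; field names are illustrative."""
    full_length: bool
    has_poly_a_tail: bool
    has_tsd: bool
    has_non_alu_sequence: bool = False


def classify_mechanism(c):
    """Apply the feature rules described in the Fig. 3c caption."""
    if c.full_length and c.has_tsd and c.has_poly_a_tail:
        return "TPRT"   # classical target-primed reverse transcription
    if c.full_length and c.has_tsd and not c.has_poly_a_tail:
        return "IP"     # likely internal priming
    if not c.full_length and not c.has_tsd and c.has_non_alu_sequence:
        return "NCAI"   # non-classical Alu insertion
    return "unclassified"


# Ten validated candidates with the feature mix reported in the caption.
candidates = (
    [InsertionCandidate(True, True, True)] * 7
    + [InsertionCandidate(True, False, True)]
    + [InsertionCandidate(False, False, False, has_non_alu_sequence=True)] * 2
)

tally = {}
for cand in candidates:
    label = classify_mechanism(cand)
    tally[label] = tally.get(label, 0) + 1

for label, n in tally.items():
    print(f"{label}: {n}/{len(candidates)} = {100 * n // len(candidates)}%")
```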
Fig. 4
L1 ORF2p endonuclease cleavage sites for all classical TPRT insertion candidates (n = 7) displayed as a WebLogo diagram [47]. The negative strand is depicted, 5′ to 3′. The published consensus sequence is depicted below in grayscale [7]
Fig. 5
Estimated age and degree of divergence from subfamily consensus sequence of AluS TPRT insertion candidates. a For each of the AluS TPRT insertion candidates (n = 7) percent divergence from the respective AluS subfamily consensus sequence is shown along with estimated ages for the elements based on CpG and non-CpG substitution rates. b Boxplot of percent divergence from the respective AluS subfamily consensus sequence of all AluS elements annotated in the reference genome (n = 686,955) and the TPRT insertion candidates (n = 7). c The TPRT insertion candidates (n = 7) are significantly less diverged from their respective AluS subfamily consensus sequence than are all the AluS elements annotated in the reference genome (n = 686,955) from their respective subfamily consensus sequence (p = 0.0038, permutation test). The distribution of the mean percent divergence of 1 × 10^6 random samples of n = 7 drawn from the total 686,955 AluS elements annotated in the reference genome is shown. The mean percent divergence of the seven TPRT insertion candidates is shown as a vertical red line
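The resampling comparison in part (c) can be sketched as follows. This is a minimal illustration, not the authors' code: the divergence values are synthetic, the function name and parameters are hypothetical, the one-sided direction (sample mean at or below the observed mean) is an assumption based on the "less diverged" wording, and far fewer resamples are drawn than the 1 × 10^6 described in the caption.

```python
import random


def permutation_p_value(population, observed, n_resamples=10_000, seed=0):
    """One-sided p-value: fraction of random size-k samples from `population`
    whose mean is at or below the mean of `observed`."""
    rng = random.Random(seed)
    k = len(observed)
    observed_mean = sum(observed) / k
    at_or_below = 0
    for _ in range(n_resamples):
        sample = rng.sample(population, k)  # draw k values without replacement
        if sum(sample) / k <= observed_mean:
            at_or_below += 1
    return at_or_below / n_resamples


# SYNTHETIC data for illustration only; not the paper's divergence values.
data_rng = random.Random(42)
synthetic_population = [data_rng.gauss(10.0, 3.0) for _ in range(5000)]
synthetic_candidates = [3.1, 4.0, 4.8, 5.2, 5.9, 6.3, 7.0]

p = permutation_p_value(synthetic_population, synthetic_candidates)
print(f"one-sided p = {p:.4f}")
```

With a finite number of resamples, the smallest nonzero p-value that can be reported is 1/n_resamples, which is one reason a large number of draws is useful when the observed mean lies far in the tail.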
Fig. 6
Conservation of functionally significant Alu sequence features in the seven classical TPRT insertion candidates. a WebLogo diagram [47] of the RNA polymerase III promoter A and B boxes, with the published consensus sequences depicted below in grayscale [33]. CpG sites are indicated by an orange arrow. b WebLogo diagram [47] of the SRP9/14 binding sites. Previously reported most highly conserved sites within the SRP9/14 binding sites of elements capable of retrotransposition are underlined by a purple bar [16]. CpG sites are indicated by an orange arrow. c WebLogo diagram [47] of two AC dinucleotides in the Alu sequence reported to play a critical role in maintaining the closed loop conformation of Alu RNA that is important for interaction with SRP9/14 and efficient retrotransposition [34]

References

    1. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, et al. Mapping copy number variation by population-scale genome sequencing. Nature. 2011;470(7332):59–65. doi: 10.1038/nature09708.
    2. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MHY, et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526(7571):75–81. doi: 10.1038/nature15394.
    3. Batzer MA, Deininger PL. Alu repeats and human genomic diversity. Nat Rev Genet. 2002;3(5):370–9. doi: 10.1038/nrg798.
    4. Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nat Genet. 2003;35(1):41–8. doi: 10.1038/ng1223.
    5. Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87(5):905–16. doi: 10.1016/S0092-8674(00)81997-2.