Mob DNA. 2016 Jul 30:7:15.
doi: 10.1186/s13100-016-0072-x. eCollection 2016.

Identification of polymorphic SVA retrotransposons using a mobile element scanning method for SVA (ME-Scan-SVA)

Hongseok Ha et al. Mob DNA. 2016.

Abstract

Background: Mobile element insertions are a major source of human genomic variation. SVA (SINE-R/VNTR/Alu) is the youngest retrotransposon family in the human genome and a number of diseases are known to be caused by SVA insertions. However, inter-individual genomic variations generated by SVA insertions and their impacts have not been studied extensively due to the difficulty in identifying polymorphic SVA insertions.

Results: To systematically identify SVA insertions at the population level and assess their genomic impact, we developed a mobile element scanning (ME-Scan) protocol that we call ME-Scan-SVA. Using a nested SVA-specific PCR enrichment method, ME-Scan-SVA selectively amplifies the 5' end of SVA elements and their flanking genomic regions. To demonstrate the utility of the protocol, we constructed and sequenced an ME-Scan-SVA library of 21 individuals and analyzed the data using a new analysis pipeline designed for the protocol. Overall, the method achieved high SVA-specificity: more than 90 % of the sequenced reads were from SVA insertions. The method also had high sensitivity (>90 %) for fixed SVA insertions that contain the SVA-specific primer-binding sites in the reference genome. Using candidate locus selection criteria that are expected to have 90 % sensitivity, we identified 151 and 29 novel polymorphic SVA candidates under relaxed and stringent cutoffs, respectively (on average 12 and 2 per individual). For the six polymorphic SVAs that we were able to validate by PCR, the average individual genotype accuracy was 92 %, demonstrating the high accuracy of the computational genotype-calling pipeline.

Conclusions: The new approach allows novel SVA insertions to be identified using high-throughput sequencing. It is cost-effective and can be applied in large-scale population studies. It can also be applied to detect potentially active SVA elements and somatic SVA retrotransposition events in different tissues or developmental stages.

Keywords: High-throughput sequencing; ME-Scan; Retrotransposon; SVA.

Figures

Fig. 1
Experimental protocol design. a Scheme of SVA element structure. b Sequence alignment of the SVA_1 and SVA_2 primer binding sites with the SVA and Alu subfamily consensus sequences. The SVA_1 and SVA_2 primer sequences are shown above the alignment and the amplification directions are indicated by arrows. The top row of the sequence alignment shows the sequences of the primer binding sites of SVA_1 and SVA_2. The SVA_1 binding site includes the SVA characteristic deletion as compared to Alu sequences. Dots in the alignment represent the same nucleotides as the primer binding site sequences. Deletions are shown as dashes and mutations are shown as the correct base for the consensus. c SVA-specific amplifications during ME-Scan-SVA library construction and the final DNA fragment structure. The DNA library after second-round amplification is size-selected at ~500 bp (an example electropherogram image is shown). White box: adaptor; grey box: index; dark green box: flanking genomic region; yellow box: TSD; orange box: (CCCTCT)n hexamer simple repeat; light green box: SVA Alu-like region
Fig. 2
Computational data analysis pipeline. a BLAST-based SVA Read filtering. The location of the 40 bp BLAST query sequence in the Alu-like region of the SVA Read is labelled. The two paired-end sequencing reads are represented by a red arrow (SVA Read) and a blue arrow (Flanking Read), respectively. b Flanking Read mapping and clustering. After mapping by BWA-MEM, Flanking Reads are filtered based on mapping quality score. Filtered Flanking Reads that are within a 500 bp sliding window are then clustered into candidate insertion positions. The color scheme is the same as in Fig. 1. Black box: clustering window. c Identifying different types of SVA insertions. A representative genomic region (dark green box) is shown. Top row: SVA insertions identified by ME-Scan-SVA. Bottom row: known SVAs annotated in the reference genome. Red star: fixed SVA insertions in the reference genome; blue star: known polymorphic SVA insertions
Fig. 3
Distribution of BLAST bit-scores of the 40 bp Alu-like fragments in SVAs in the human reference genome. X-axis: BLAST bit-scores, Y-axis: the number of SVAs in the human reference genome (hg19) in each bit-score category. Each bar is broken down into color sections based on the RepeatMasker annotation
Fig. 4
Sensitivity analysis. The sensitivity for identifying fixed SVA insertions under different TPM and UR cutoffs. a average individual sensitivity; b overall sensitivity. The sensitivity is shown as the percentage of fixed insertions identified. Results under relaxed and stringent SVA Read cutoffs are shown in the left and right panel, respectively
Fig. 5
Allele frequency distribution of polymorphic SVA insertions. The number of individuals having an SVA insertion is shown on the X-axis. The percentage of polymorphic or novel polymorphic SVAs in each individual bin is shown on the Y-axis. a relaxed SVA Read cutoff; b stringent SVA Read cutoff
Fig. 6
Annotation of SVA insertions. a polymorphic SVAs; b novel polymorphic SVAs. Results under relaxed and stringent SVA Read cutoffs are shown in the left and right panel, respectively
Fig. 7
Abundance of SVA insertions in chromatin states. Chromatin state profiles (Y-axis) from nine cell lines (X-axis) were obtained from ChromHMM [44]. For each chromatin state, the normalized number of SVA insertions is shown. a polymorphic SVAs; b novel polymorphic SVAs. Results under relaxed and stringent SVA Read cutoffs are shown in the left and right panel, respectively
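
The Flanking Read filtering and clustering step described in the Fig. 2b caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `FlankingRead` record, the `cluster_flanking_reads` function, and the `min_mapq` default of 30 are hypothetical names and values chosen for illustration, and the sketch assumes that a read joins the current cluster when it lies within the window size (500 bp) of that cluster's first read. The captions do not state the mapping quality threshold or the exact windowing rule.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class FlankingRead(NamedTuple):
    """Hypothetical record for one mapped Flanking Read."""
    chrom: str
    pos: int   # leftmost mapped position
    mapq: int  # mapping quality reported by the aligner


def cluster_flanking_reads(
    reads: Iterable[FlankingRead],
    window: int = 500,   # window size taken from the Fig. 2b caption
    min_mapq: int = 30,  # assumed threshold; not given in the source
) -> List[Tuple[str, int, int, int]]:
    """Return (chrom, start, end, n_reads) for each candidate locus."""
    # Step 1: keep only reads that pass the mapping quality filter.
    by_chrom = defaultdict(list)
    for read in reads:
        if read.mapq >= min_mapq:
            by_chrom[read.chrom].append(read.pos)

    # Step 2: walk the sorted positions and group reads that fall
    # within `window` bp of the first read of the current cluster.
    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        start = end = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - start <= window:
                end = pos
                count += 1
            else:
                clusters.append((chrom, start, end, count))
                start = end = pos
                count = 1
        clusters.append((chrom, start, end, count))
    return clusters


example_reads = [
    FlankingRead("chr1", 100, 60),
    FlankingRead("chr1", 350, 60),
    FlankingRead("chr1", 599, 60),
    FlankingRead("chr1", 700, 60),
    FlankingRead("chr1", 5000, 10),  # removed by the quality filter
    FlankingRead("chr2", 42, 60),
]
candidate_loci = cluster_flanking_reads(example_reads)
```

In this toy input, the three chr1 reads at positions 100 to 599 form one candidate locus, the read at 700 starts a new one, and the low-quality read at 5000 is discarded before clustering.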

References

    1. de Koning AP, Gu W, Castoe TA, Batzer MA, Pollock DD. Repetitive elements may comprise over two-thirds of the human genome. PLoS Genet. 2011;7(12):e1002384. doi: 10.1371/journal.pgen.1002384.
    2. Cordaux R, Batzer MA. The impact of retrotransposons on human genome evolution. Nat Rev Genet. 2009;10(10):691–703. doi: 10.1038/nrg2640.
    3. Beck CR, Garcia-Perez JL, Badge RM, Moran JV. LINE-1 elements in structural variation and disease. Annu Rev Genomics Hum Genet. 2011;12:187–215. doi: 10.1146/annurev-genom-082509-141802.
    4. Hancks DC, Kazazian HH Jr. Active human retrotransposons: variation and disease. Curr Opin Genet Dev. 2012;22(3):191–203. doi: 10.1016/j.gde.2012.02.006.
    5. Stewart C, Kural D, Stromberg MP, Walker JA, Konkel MK, Stutz AM, Urban AE, Grubert F, Lam HY, Lee WP, et al. A comprehensive map of mobile element insertion polymorphisms in humans. PLoS Genet. 2011;7(8):e1002236. doi: 10.1371/journal.pgen.1002236.