Genomics. 2021 Nov;113(6):3804-3810.
doi: 10.1016/j.ygeno.2021.09.011. Epub 2021 Sep 15.

Characterization of full-length LINE-1 insertions in 154 genomes


Jessica S Wong et al. Genomics. 2021 Nov.

Abstract

Long interspersed nuclear elements (LINEs) are retrotransposons that contribute to genetic variation in the human genome. Identifying LINE-1 elements in larger-scale studies with sequencing technologies is challenging because of cost and limited scalability. We developed an approach that uses optical mapping to detect full-length LINE-1 insertions and 10× sequencing to confirm them. In NA12878, we found 51 true positive full-length LINE-1 insertions, of which 4 are novel. Repeating our analysis on a larger sample set representing 26 populations, we identified 329 full-length LINE-1 elements, of which 123 are novel. Of these 329 LINE-1 insertions, 24.8% were shared among all 5 superpopulations (AFR, AMR, EUR, EAS, SAS). The African superpopulation has a higher percentage of population-specific LINE-1 insertions than any other superpopulation. These data indicate that our approach can provide fast, cost-effective, and more accurate LINE-1 detection. They also provide insight into the variation of LINE-1 elements between populations.

Keywords: Genomics; LINE-1.


Figures

Figure 1.
Schematic of LINE-1 detection and confirmation. (A) Label sites on LINE-1 references and their positions within a 6 kb insertion. The yellow dots represent Nt.BspQ1 motif sites and the orange dots represent DLE-1 motif sites. Each value on the y-axis represents one of 300 references, and the x-axis is the position from 0 to 6 kb within LINE-1. (B) DLE-1 motif positions and Nt.BspQ1 motif positions are combined to determine the distances between DLE-1 - Nt.BspQ1 and Nt.BspQ1 - DLE-1. (C) Sequencing confirmation of a LINE-1 insertion using 10x data shows that soft-clipped reads (orange) align to LINE-1 but not to hg38 (green).
Figure 2.
Detection of a LINE-1 insertion in optical mapping data. (A) Optical mapping visualization of sample HG04006 in region chr6:72082349-72108286. Green bars represent the hg38 reference and blue bars represent the sample contig. Black nicks in both the DLE-1 and Nt.BspQ1 datasets align to hg38. Orange nicks in the DLE-1 contig and yellow nicks in the Nt.BspQ1 contig do not align to any sites in hg38. The positions of unaligned DLE-1 and Nt.BspQ1 nick sites within an insertion are used for calculating distances. (B) Soft-clipped reads are aligned to the LINE-1 reference at 0 bp and 6 kb.
Figure 3.
Population distribution of all LINE-1 insertion loci. (A) Distribution of LINE-1 insertions among the five superpopulations: loci found only in African (red), East Asian (green), South Asian (orange), European (purple), and American (blue) populations. Percentages are also calculated for LINE-1 insertions shared among 2-4 superpopulations (gray) and among all superpopulations (black). (B) PCA plot of LINE-1 insertions for all five superpopulations.
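
The label-distance step described in Figure 1B can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The recognition motifs used here (CTTAAG for DLE-1, GCTCTTC for Nt.BspQ1) are assumptions that do not appear in the text above, and the function names and toy sequence are hypothetical. The sketch locates motif sites on either strand of a sequence, merges the two label sets by position, and reports the distance between each pair of adjacent labels that come from different enzymes.

```python
import re

# Assumed recognition motifs; they are not given in the source text.
MOTIFS = {"DLE-1": "CTTAAG", "Nt.BspQ1": "GCTCTTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def label_sites(seq, motif):
    """Return sorted 0-based start positions of the motif on either strand."""
    seq = seq.upper()
    sites = set()
    for pattern in {motif, reverse_complement(motif)}:
        # A lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={pattern})", seq):
            sites.add(match.start())
    return sorted(sites)


def inter_enzyme_distances(seq):
    """Distances between adjacent labels produced by different enzymes."""
    labels = sorted(
        (pos, enzyme)
        for enzyme, motif in MOTIFS.items()
        for pos in label_sites(seq, motif)
    )
    distances = []
    for (pos1, enz1), (pos2, enz2) in zip(labels, labels[1:]):
        if enz1 != enz2:
            distances.append((f"{enz1} - {enz2}", pos2 - pos1))
    return distances


# Hypothetical toy sequence, not a real LINE-1 reference.
toy = "AAAA" + "CTTAAG" + "TTTTT" + "GCTCTTC" + "AA" + "GAAGAGC" + "CCC" + "CTTAAG"
print(inter_enzyme_distances(toy))
```

On a real LINE-1 reference, a distance profile of this kind could be compared against the spacing of unaligned labels inside an optical-map insertion. How the authors perform that comparison is not described in this record.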

