[Preprint]. 2024 Apr 20:2024.04.18.590093.
doi: 10.1101/2024.04.18.590093.

Long-read sequencing and structural variant characterization in 1,019 samples from the 1000 Genomes Project


Siegfried Schloissnig et al. bioRxiv.

Update in

  • Structural variation in 1,019 diverse humans based on long-read sequencing.
    Schloissnig S, Pani S, Ebler J, Hain C, Tsapalou V, Söylev A, Hüther P, Ashraf H, Prodanov T, Asparuhova M, Magalhães H, Höps W, Sotelo-Fonseca JE, Fitzgerald T, Santana-Garcia W, Moreira-Pinhal R, Hunt S, Pérez-Llanos FJ, Wollenweber TE, Sivalingam S, Wieczorek D, Cáceres M, Gilissen C, Birney E, Ding Z, Jensen JN, Podduturi N, Stutzki J, Rodriguez-Martin B, Rausch T, Marschall T, Korbel JO. Nature. 2025 Aug;644(8076):442-452. doi: 10.1038/s41586-025-09290-7. Epub 2025 Jul 23. PMID: 40702182. Free PMC article.

Abstract

Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate-coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest that a continuum of homology-mediated rearrangement processes is integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.
