This is a preprint.
Long-read sequencing and structural variant characterization in 1,019 samples from the 1000 Genomes Project
- PMID: 38659906
- PMCID: PMC11042266
- DOI: 10.1101/2024.04.18.590093
Update in
- Structural variation in 1,019 diverse humans based on long-read sequencing. Nature. 2025 Aug;644(8076):442-452. doi: 10.1038/s41586-025-09290-7. Epub 2025 Jul 23. PMID: 40702182. Free PMC article.
Abstract
Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate-coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest that a continuum of homology-mediated rearrangement processes is integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.