This is a preprint.
Benchmarking, detection, and genotyping of structural variants in a population of whole-genome assemblies using the SVGAP pipeline
- PMID: 39975360
- PMCID: PMC11839052
- DOI: 10.1101/2025.02.07.637096
Abstract
Comparisons of complete genome assemblies offer a direct procedure for characterizing all genetic differences among them. However, existing tools are often limited to specific aligners or optimized for specific organisms, narrowing their applicability, particularly for large and repetitive plant genomes. Here, we introduce SVGAP, a pipeline for structural variant (SV) discovery, genotyping, and annotation from high-quality genome assemblies at the population level. Through extensive benchmarks using simulated SV datasets in individual, population, and phylogenetic contexts, we demonstrate that SVGAP performs favorably relative to existing tools in SV discovery. Additionally, SVGAP is one of the few tools to address the challenge of genotyping SVs within large assembled genome samples, and it generates fully genotyped VCF files. Applying SVGAP to 26 maize genomes revealed hidden genomic diversity in centromeres, driven by abundant insertions of centromere-specific LTR-retrotransposons. The output of SVGAP is well-suited for pan-genome construction and facilitates the interpretation of previously unexplored genomic regions.