Multi-platform discovery of haplotype-resolved structural variation in human genomes
- PMID: 30992455
- PMCID: PMC6467913
- DOI: 10.1038/s41467-018-08148-z
Abstract
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
Conflict of interest statement
J.K., C.-S.C., C.C.L., and A.M.W. are employees and shareholders of Pacific Biosciences (aka PacBio); A.R.H., T.A., H.C., E.T.L., J.L., and A.W.C.P. are employees and shareholders of Bionano Genomics; D.M.C., W.H.H., P.M., S.K.-P., and W.X. are employees and shareholders of 10X Genomics; J.F. is an employee of Illumina; J.E.L. is an employee of DNALink; S.P.L. is an employee of TreeCode Sdn Bhd. P.F. is a member of the scientific advisory board (SAB) of Fabric Genomics, Inc., and Eagle Genomics, Ltd. E.E.E. is on the SAB of DNAnexus, Inc. and was a consultant for Kunming University of Science and Technology (KUST) as part of the 1000 China Talent Program (2014–2016). C.L. was on the SAB of Bionano Genomics. All other authors declare no competing interests.
Similar articles
- A Comparison of Structural Variant Calling from Short-Read and Nanopore-Based Whole-Genome Sequencing Using Optical Genome Mapping as a Benchmark. Genes (Basel). 2024 Jul 16;15(7):925. doi: 10.3390/genes15070925. PMID: 39062704. Free PMC article.
- VISTA: an integrated framework for structural variant discovery. Brief Bioinform. 2024 Jul 25;25(5):bbae462. doi: 10.1093/bib/bbae462. PMID: 39297879. Free PMC article.
- SvABA: genome-wide detection of structural variants and indels by local assembly. Genome Res. 2018 Apr;28(4):581-591. doi: 10.1101/gr.221028.117. Epub 2018 Mar 13. PMID: 29535149. Free PMC article.
- Genomic Analysis in the Age of Human Genome Sequencing. Cell. 2019 Mar 21;177(1):70-84. doi: 10.1016/j.cell.2019.02.032. PMID: 30901550. Free PMC article. Review.
- Genetic variation and the de novo assembly of human genomes. Nat Rev Genet. 2015 Nov;16(11):627-40. doi: 10.1038/nrg3933. Epub 2015 Oct 7. PMID: 26442640. Free PMC article. Review.
Cited by
- Characterization of intermediate-sized insertions using whole-genome sequencing data and analysis of their functional impact on gene expression. Hum Genet. 2021 Aug;140(8):1201-1216. doi: 10.1007/s00439-021-02291-2. Epub 2021 May 12. PMID: 33978893.
- Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing. Fundam Res. 2022 Mar 2;2(6):946-953. doi: 10.1016/j.fmre.2022.02.005. eCollection 2022 Nov. PMID: 38933383. Free PMC article.
- miniSNV: accurate and fast single nucleotide variant calling from nanopore sequencing data. Brief Bioinform. 2024 Sep 23;25(6):bbae473. doi: 10.1093/bib/bbae473. PMID: 39331016. Free PMC article.
- Somatic mutation phasing and haplotype extension using linked-reads in multiple myeloma. bioRxiv [Preprint]. 2024 Aug 10:2024.08.09.607342. doi: 10.1101/2024.08.09.607342. PMID: 39149342. Free PMC article. Preprint.
- DandD: Efficient measurement of sequence growth and similarity. iScience. 2024 Feb 1;27(3):109054. doi: 10.1016/j.isci.2024.109054. eCollection 2024 Mar 15. PMID: 38361606. Free PMC article.
Grants and funding
- R01 HG002898/HG/NHGRI NIH HHS/United States
- P30 CA016672/CA/NCI NIH HHS/United States
- R01 CA172652/CA/NCI NIH HHS/United States
- U01 HG006513/HG/NHGRI NIH HHS/United States
- U41 HG007497/HG/NHGRI NIH HHS/United States
- P30 CA034196/CA/NCI NIH HHS/United States
- T32 HG002295/HG/NHGRI NIH HHS/United States
- R01 HG010169/HG/NHGRI NIH HHS/United States
- R21 AI117407/AI/NIAID NIH HHS/United States
- R01 HD081256/HD/NICHD NIH HHS/United States
- F31 HG009223/HG/NHGRI NIH HHS/United States
- WT_/Wellcome Trust/United Kingdom
- R01 HG002385/HG/NHGRI NIH HHS/United States
- R15 HG009565/HG/NHGRI NIH HHS/United States
- R01 HG005946/HG/NHGRI NIH HHS/United States
- T32 DK067872/DK/NIDDK NIH HHS/United States
- R01 MH115957/MH/NIMH NIH HHS/United States
- R01 HG007068/HG/NHGRI NIH HHS/United States
- R25 HG007153/HG/NHGRI NIH HHS/United States
- R01 CA166661/CA/NCI NIH HHS/United States
- R56 MH115957/MH/NIMH NIH HHS/United States
- T32 GM008666/GM/NIGMS NIH HHS/United States
- R01 HG008628/HG/NHGRI NIH HHS/United States
- S10 OD021644/OD/NIH HHS/United States
- U24 HG007497/HG/NHGRI NIH HHS/United States