Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes
- PMID: 19447966
- PMCID: PMC2704429
- DOI: 10.1101/gr.088633.108
Abstract
Recent studies show that along with single nucleotide polymorphisms and small indels, larger structural variants among human individuals are common. The Human Genome Structural Variation Project aims to identify and classify deletions, insertions, and inversions (>5 Kbp) in a small number of normal individuals with a fosmid-based paired-end sequencing approach using traditional sequencing technologies. The realization of new ultra-high-throughput sequencing platforms now makes it feasible to detect the full spectrum of genomic variation among many individual genomes, including cancer patients and others suffering from diseases of genomic origin. Unfortunately, existing algorithms for identifying structural variation (SV) among individuals have not been designed to handle the short read lengths and the errors implied by the "next-gen" sequencing (NGS) technologies. In this paper, we give combinatorial formulations for the SV detection between a reference genome sequence and a next-gen-based, paired-end, whole genome shotgun-sequenced individual. We describe efficient algorithms for each of the formulations we give, which all turn out to be fast and quite reliable; they are also applicable to all next-gen sequencing methods (Illumina, 454 Life Sciences [Roche], ABI SOLiD, etc.) and traditional capillary sequencing technology. We apply our algorithms to identify SV among individual genomes very recently sequenced by Illumina technology.
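As a rough illustration of the kind of evidence a paired-end approach works from, the sketch below labels one paired-end mapping against the reference by its span and strand orientation. It is not the formulation or algorithm of the paper, which the abstract does not spell out; it is only the generic paired-end signature idea. The names `PairedMapping` and `classify_pair`, the span bounds, and the convention that a concordant pair maps as +/- are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedMapping:
    """One paired-end read mapped to the reference (hypothetical record)."""
    left_pos: int       # reference coordinate of the leftmost end
    right_pos: int      # reference coordinate of the rightmost end
    left_strand: str    # "+" or "-"
    right_strand: str   # "+" or "-"


def classify_pair(m: PairedMapping, min_span: int, max_span: int) -> str:
    """Label a mapping by the structural variant type it would suggest.

    Assumes concordant pairs map as +/- with a span inside
    [min_span, max_span]; both bounds are illustrative parameters.
    """
    if m.left_strand == m.right_strand:
        # Both ends on the same strand: one end lies in flipped sequence.
        return "inversion"
    if (m.left_strand, m.right_strand) != ("+", "-"):
        # -/+ (everted) pairs are outside the scope of this sketch.
        return "other"
    span = m.right_pos - m.left_pos
    if span > max_span:
        # Ends map too far apart: the individual lacks reference sequence.
        return "deletion"
    if span < min_span:
        # Ends map too close: the individual carries extra sequence.
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    pair = PairedMapping(1_000, 9_000, "+", "-")
    print(classify_pair(pair, min_span=150, max_span=250))
```

A single discordant pair is weak evidence by itself, particularly with short, error-prone reads that may map to several locations, so in practice many pairs supporting the same event would need to be grouped before a call is made.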