Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications
- PMID: 25017105
- PMCID: PMC4753679
- DOI: 10.1038/ng.3036
Abstract
High-throughput DNA sequencing technology has transformed genetic research and is starting to make an impact on clinical practice. However, analyzing high-throughput sequencing data remains challenging, particularly in clinical settings where accuracy and turnaround times are critical. We present a new approach to this problem, implemented in a software package called Platypus. Platypus achieves high sensitivity and specificity for SNPs, indels and complex polymorphisms by using local de novo assembly to generate candidate variants, followed by local realignment and probabilistic haplotype estimation. It is an order of magnitude faster than existing tools and generates calls from raw aligned read data without preprocessing. We demonstrate the performance of Platypus in clinically relevant experimental designs by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.
Similar articles
- An analytical workflow for accurate variant discovery in highly divergent regions. BMC Genomics. 2016 Sep 2;17(1):703. doi: 10.1186/s12864-016-3045-z. PMID: 27590916 Free PMC article.
- Comparative analysis of de novo assemblers for variation discovery in personal genomes. Brief Bioinform. 2018 Sep 28;19(5):893-904. doi: 10.1093/bib/bbx037. PMID: 28407084 Free PMC article.
- Impact of post-alignment processing in variant discovery from whole exome data. BMC Bioinformatics. 2016 Oct 3;17(1):403. doi: 10.1186/s12859-016-1279-z. PMID: 27716037 Free PMC article.
- Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data. Bioinformatics. 2013 Sep 15;29(18):2245-52. doi: 10.1093/bioinformatics/btt386. Epub 2013 Jul 3. PMID: 23825370 Free PMC article.
- Genetic variation and the de novo assembly of human genomes. Nat Rev Genet. 2015 Nov;16(11):627-40. doi: 10.1038/nrg3933. Epub 2015 Oct 7. PMID: 26442640 Free PMC article. Review.
Cited by
- A Novel Low-Risk Germline Variant in the SH2 Domain of the SRC Gene Affects Multiple Pathways in Familial Colorectal Cancer. J Pers Med. 2021 Apr 1;11(4):262. doi: 10.3390/jpm11040262. PMID: 33916261 Free PMC article.
- Uveal Melanoma-Derived Extracellular Vesicles Display Transforming Potential and Carry Protein Cargo Involved in Metastatic Niche Preparation. Cancers (Basel). 2020 Oct 11;12(10):2923. doi: 10.3390/cancers12102923. PMID: 33050649 Free PMC article.
- Community-based recruitment and exome sequencing indicates high diagnostic yield in adults with intellectual disability. Mol Genet Genomic Med. 2020 Oct;8(10):e1439. doi: 10.1002/mgg3.1439. Epub 2020 Aug 7. PMID: 32767738 Free PMC article.
- Leveraging the power of high performance computing for next generation sequencing data analysis: tricks and twists from a high throughput exome workflow. PLoS One. 2015 May 5;10(5):e0126321. doi: 10.1371/journal.pone.0126321. eCollection 2015. PMID: 25942438 Free PMC article.
- Accurate determination of node and arc multiplicities in de bruijn graphs using conditional random fields. BMC Bioinformatics. 2020 Sep 14;21(1):402. doi: 10.1186/s12859-020-03740-x. PMID: 32928110 Free PMC article.