A universal SNP and small-indel variant caller using deep neural networks
- PMID: 30247488
- DOI: 10.1038/nbt.4235
Abstract
Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variants and the true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools. The learned model generalizes across genome builds and mammalian species, allowing nonhuman sequencing projects to benefit from the wealth of human ground-truth data. We further show that DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, including deep whole genomes from 10X Genomics and Ion Ampliseq exomes, highlighting the benefits of using more automated and generalizable techniques for variant calling.
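The idea of treating a read pileup as an image can be illustrated with a small sketch. The code below is not DeepVariant's encoding: the function names, the two channels (base identity and a mismatch flag), and the three-class genotype labels are assumptions chosen for illustration. It only shows how aligned reads around a candidate site might be laid out as a grid that a convolutional network could take as input.

```python
# Hypothetical sketch: names and channel choices are illustrative assumptions,
# not DeepVariant's actual pileup encoding.
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}  # 0 = no coverage or unknown base
GENOTYPE_LABELS = ("hom_ref", "het", "hom_alt")  # assumed three-class target


def encode_pileup(ref_window, reads):
    """Lay out a reference window and aligned reads as a 2-channel grid.

    ref_window: reference bases over the window, e.g. "ACGTA".
    reads: list of (offset, bases) pairs, where offset is the column in the
           window at which the read starts.
    Returns rows[row][col] = (base_code, mismatch_flag); row 0 is the reference.
    """
    width = len(ref_window)
    rows = [[(BASE_CODES.get(b, 0), 0) for b in ref_window]]
    for offset, bases in reads:
        row = [(0, 0)] * width  # start with an uncovered row
        for i, base in enumerate(bases):
            col = offset + i
            if 0 <= col < width:
                mismatch = int(base != ref_window[col])
                row[col] = (BASE_CODES.get(base, 0), mismatch)
        rows.append(row)
    return rows


def alt_fraction(rows, col):
    """Fraction of reads covering col that disagree with the reference."""
    covering = [r[col] for r in rows[1:] if r[col][0] != 0]
    if not covering:
        return 0.0
    return sum(flag for _, flag in covering) / len(covering)


# Toy window with a candidate site at column 2 (reference base G).
rows = encode_pileup("ACGTA", [(0, "ACGTA"), (1, "CTTA"), (2, "TT")])
print(len(rows), alt_fraction(rows, 2))
```

In a trained caller, a grid like this (with many more reads and channels) would be the input, and one of the genotype labels would be the output learned from ground-truth calls; `alt_fraction` is only a hand-written summary included to show what signal the grid carries.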
Similar articles
- dv-trio: a family-based variant calling pipeline using DeepVariant. Bioinformatics. 2020 Jun 1;36(11):3549-3551. doi: 10.1093/bioinformatics/btaa116. PMID: 32315409
- Lean and deep models for more accurate filtering of SNP and INDEL variant calls. Bioinformatics. 2020 Apr 1;36(7):2060-2067. doi: 10.1093/bioinformatics/btz901. PMID: 31830260
- A multi-task convolutional deep neural network for variant calling in single molecule sequencing. Nat Commun. 2019 Mar 1;10(1):998. doi: 10.1038/s41467-019-09025-z. PMID: 30824707 Free PMC article.
- Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics. 2014 Oct 15;30(20):2843-51. doi: 10.1093/bioinformatics/btu356. Epub 2014 Jun 27. PMID: 24974202 Free PMC article. Review.
- Deep learning of genomic variation and regulatory network data. Hum Mol Genet. 2018 May 1;27(R1):R63-R71. doi: 10.1093/hmg/ddy115. PMID: 29648622 Free PMC article. Review.
Cited by
- Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes. Nat Biotechnol. 2020 Sep;38(9):1044-1053. doi: 10.1038/s41587-020-0503-6. Epub 2020 May 4. PMID: 32686750 Free PMC article.
- From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures. Comput Struct Biotechnol J. 2022 Aug 18;20:4579-4599. doi: 10.1016/j.csbj.2022.08.019. eCollection 2022. PMID: 36090814 Free PMC article. Review.
- Resistance of SARS-CoV-2 variants to neutralization by monoclonal and serum-derived polyclonal antibodies. Nat Med. 2021 Apr;27(4):717-726. doi: 10.1038/s41591-021-01294-w. Epub 2021 Mar 4. PMID: 33664494 Free PMC article.
- Exploiting public databases of genomic variation to quantify evolutionary constraint on the branch point sequence in 30 plant and animal species. Nucleic Acids Res. 2023 Dec 11;51(22):12069-12075. doi: 10.1093/nar/gkad970. PMID: 37953306 Free PMC article.
- How artificial intelligence might disrupt diagnostics in hematology in the near future. Oncogene. 2021 Jun;40(25):4271-4280. doi: 10.1038/s41388-021-01861-y. Epub 2021 Jun 8. PMID: 34103684 Free PMC article. Review.