Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic
- PMID: 41102444
- DOI: 10.1038/s41587-025-02839-x
Abstract
Somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies offer potential advantages in repeat mapping and variant phasing. We present DeepSomatic, a deep-learning method for detecting somatic single nucleotide variations and insertions and deletions from both short-read and long-read data. The method has modes for whole-genome and whole-exome sequencing and can run on tumor-normal, tumor-only and formalin-fixed paraffin-embedded samples. To train DeepSomatic and help address the dearth of publicly available training and benchmarking data for somatic variant detection, we generated and made openly available the Cancer Standards Long-read Evaluation (CASTLE) dataset of six matched tumor-normal cell line pairs whole-genome sequenced with Illumina, PacBio HiFi and Oxford Nanopore Technologies, along with benchmark variant sets. Across samples, both cell line and patient-derived, and across short-read and long-read sequencing technologies, DeepSomatic consistently outperforms existing callers.
© 2025. The Author(s), under exclusive licence to Springer Nature America, Inc.
Conflict of interest statement
Competing interests: K.S., D.E.C., P.-C.C., A. Kolesnikov, L.B., J.C.M. and A.C. are employees of Google LLC and own Alphabet stock as part of the standard compensation package. M.S.F. is a part of the speakers bureau for Bayer and PacBio. The remaining authors declare no competing interests.
Update of
- DeepSomatic: Accurate somatic small variant discovery for multiple sequencing technologies. bioRxiv [Preprint]. 2024 Aug 19:2024.08.16.608331. doi: 10.1101/2024.08.16.608331. Update in: Nat Biotechnol. 2025 Oct 16. doi: 10.1038/s41587-025-02839-x. PMID: 39229187.
Grants and funding
- U41HG010972/U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U24HG011853/U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- OT2OD033761/U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01HG010485/U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U01CA253405/U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)