Characterization of subclonal variants in HG002 Genome in a Bottle reference material as a resource for benchmarking variant callers
- PMID: 41421359
- DOI: 10.1016/j.xgen.2025.101104
Abstract
We developed a benchmark set of subclonal variants in the Genome in a Bottle (GIAB) Consortium HG002 reference material (RM) DNA for evaluating lower-frequency variant callsets. We used a somatic variant caller with high-coverage (300×) whole-genome sequencing data from the GIAB Ashkenazi Jewish trio to identify potential subclonal variants in the HG002 RM DNA. Using orthogonal sequencing data and manual curation, we defined a benchmark set of 85 high-confidence subclonal single-nucleotide variants (SNVs) (allele frequency [AF] > 5%) and a benchmark region covering 2.45 Gbp of the autosomes. External validation indicated that the benchmark reliably identifies both false negatives and false positives across a variety of sequencing technologies and variant callers. By characterizing mosaic SNVs in this widely used cell line, we have expanded the scope of bioinformatic and sequencing applications for which the HG002 GIAB RM can be used to include benchmarking of subclonal SNVs.
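To illustrate how a benchmark set and benchmark region of this kind are typically used, the following is a minimal sketch, not the authors' pipeline. It assumes exact matching on chromosome, position, reference allele, and alternate allele, and BED-style (0-based, half-open) regions. The `Variant` type, the `compare` function, and all coordinates are hypothetical and exist only for illustration. Calls outside the benchmark region are ignored, because nothing can be concluded about them.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position, as in VCF
    ref: str
    alt: str


# (chrom, start, end): 0-based, half-open interval, as in BED
Region = tuple[str, int, int]


def in_regions(v: Variant, regions: list[Region]) -> bool:
    """Return True if the 1-based variant position lies inside any region."""
    return any(c == v.chrom and s < v.pos <= e for c, s, e in regions)


def compare(calls: set[Variant], benchmark: set[Variant], regions: list[Region]):
    """Classify calls as TP/FP and benchmark variants as FN, within regions only."""
    calls_in = {v for v in calls if in_regions(v, regions)}
    truth_in = {v for v in benchmark if in_regions(v, regions)}
    tp = calls_in & truth_in  # called and in the benchmark
    fp = calls_in - truth_in  # called, but absent from the benchmark
    fn = truth_in - calls_in  # in the benchmark, but not called
    return tp, fp, fn


# Toy data (hypothetical coordinates, not from the HG002 benchmark)
regions = [("chr1", 100, 200)]
benchmark = {Variant("chr1", 150, "A", "G"), Variant("chr1", 180, "C", "T")}
calls = {
    Variant("chr1", 150, "A", "G"),  # matches the benchmark
    Variant("chr1", 160, "G", "A"),  # inside the region, not in the benchmark
    Variant("chr1", 500, "T", "C"),  # outside the region, so not assessed
}

tp, fp, fn = compare(calls, benchmark, regions)
print(len(tp), len(fp), len(fn))
```

In practice, dedicated comparison tools also reconcile differences in variant representation, which this exact-match sketch does not attempt.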
Keywords: Genome In A Bottle; SNV; genome sequencing; mosaic variant; reference material; somatic mosaicism; somatic variant; variant benchmarking; variant calling.
Published by Elsevier Inc.
Conflict of interest statement
Declaration of interests: C.E.M. is a co-founder of Onegevity. Y.W., M.R., A.V., L.M., W.-T.C., S.C., J.H., R.M., and G.P. are Illumina employees and equity owners. A.C., P.-C.C., K. Shafin, D.C., A.K., and L.B. are employees of Google LLC and receive equity compensation. P.C.B. sits on the scientific advisory boards of Intersect Diagnostics, Inc., and BioSymetrics, Inc., and previously sat on that of Sage Bionetworks.
