A multi-platform reference for somatic structural variation detection
- PMID: 36778136
- PMCID: PMC9903816
- DOI: 10.1016/j.xgen.2022.100139
Abstract
Accurate detection of somatic structural variation (SV) in cancer genomes remains a challenging problem. This is in part due to the lack of high-quality, gold-standard datasets that enable the benchmarking of experimental approaches and bioinformatic analysis pipelines. Here, we performed somatic SV analysis of the paired melanoma and normal lymphoblastoid COLO829 cell lines using four different sequencing technologies. Based on the evidence from multiple technologies combined with extensive experimental validation, we compiled a comprehensive set of carefully curated and validated somatic SVs, comprising all SV types. We demonstrate the utility of this resource by determining the SV detection performance as a function of tumor purity and sequence depth, highlighting the importance of assessing these parameters in cancer genomics projects. The truth somatic SV dataset as well as the underlying raw multi-platform sequencing data are freely available and are an important resource for community somatic benchmarking efforts.
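As a rough illustration of how a truth set of this kind could be used for benchmarking, the sketch below scores a somatic SV call set against a truth set by matching breakpoints of the same SV type within a tolerance window. It is not the authors' pipeline: the `SV` record, the `matches` and `benchmark` helpers, the 100 bp tolerance, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class SV(NamedTuple):
    """Hypothetical minimal SV record: two breakpoints plus a type label."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    svtype: str  # e.g. "DEL", "DUP", "INV", "INS", "BND"


def matches(a: SV, b: SV, tol: int = 100) -> bool:
    """True if both breakpoints agree within `tol` bp and the types match."""
    return (
        a.svtype == b.svtype
        and a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.pos1 - b.pos1) <= tol
        and abs(a.pos2 - b.pos2) <= tol
    )


def benchmark(calls: List[SV], truth: List[SV], tol: int = 100) -> Tuple[float, float]:
    """Return (precision, recall), pairing each truth SV with at most one call."""
    matched_truth = set()
    true_positive_calls = 0
    for call in calls:
        for i, t in enumerate(truth):
            if i not in matched_truth and matches(call, t, tol):
                matched_truth.add(i)
                true_positive_calls += 1
                break
    precision = true_positive_calls / len(calls) if calls else 0.0
    recall = len(matched_truth) / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy coordinates, not taken from the COLO829 truth set.
    truth = [SV("chr1", 1000, "chr1", 5000, "DEL"),
             SV("chr2", 2000, "chr7", 9000, "BND")]
    calls = [SV("chr1", 1010, "chr1", 4990, "DEL"),
             SV("chr3", 100, "chr3", 900, "DUP")]
    print(benchmark(calls, truth))
```

Repeating such a comparison on call sets generated at different tumor purities or sequencing depths would give the kind of performance curve the abstract describes.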
Keywords: benchmarking; cancer; long sequencing read; short sequencing read; structural variant; truth set; whole-genome sequencing.
© 2022 The Author(s).
Conflict of interest statement
A.M.W. is an employee and shareholder of Pacific Biosciences. W.P.K. is an employee and shareholder of Cyclomics B.V.
