Bioinformatics. 2015 May 1;31(9):1469-71.
doi: 10.1093/bioinformatics/btu828. Epub 2014 Dec 17.

VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications

John C Mu et al. Bioinformatics. 2015.

Abstract

Summary: VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing using simulated or real data. Rather than simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model draws on information such as previously reported mutations to make the synthetic genomes biologically relevant. VarSim simulates and validates a wide range of variants, including single nucleotide variants, small indels and large structural variants. It is an automated, comprehensive compute framework that supports parallel computation and multiple read simulators. Furthermore, we developed a novel map data structure to validate read alignments, a strategy to compare variants binned in size ranges and a lightweight, interactive, graphical report to visualize validation results with detailed statistics. Thus far, it is the most comprehensive validation tool for secondary analysis in next-generation sequencing.
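
The idea of comparing variants binned in size ranges can be illustrated with a minimal Python sketch. This is not VarSim's implementation: the bin edges, the `variant_size` rule, the tuple representation `(chrom, pos, ref, alt)` and the exact-match comparison are all assumptions made for illustration. A real validator would likely tolerate small breakpoint differences for structural variants.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bin edges in base pairs (assumption, not taken from VarSim).
BIN_EDGES = [1, 2, 10, 50]


def variant_size(variant):
    """Size of a (chrom, pos, ref, alt) variant: length change for indels,
    otherwise the substituted length (1 for a single nucleotide variant)."""
    _, _, ref, alt = variant
    if len(ref) == len(alt):
        return len(ref)
    return abs(len(alt) - len(ref))


def bin_label(size):
    """Label of the half-open size bin [lo, hi) that contains `size`."""
    i = bisect_right(BIN_EDGES, size)
    lo = BIN_EDGES[i - 1] if i > 0 else 0
    if i < len(BIN_EDGES):
        return f"[{lo}, {BIN_EDGES[i]})"
    return f"[{lo}, inf)"


def compare_by_size(truth, called):
    """Count true positives, false negatives and false positives per size bin,
    using exact matching of variant tuples (a simplifying assumption)."""
    truth, called = set(truth), set(called)
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for v in truth:
        key = "tp" if v in called else "fn"
        counts[bin_label(variant_size(v))][key] += 1
    for v in called - truth:
        counts[bin_label(variant_size(v))]["fp"] += 1
    return dict(counts)


def sensitivity(c):
    """TP / (TP + FN), or None when the bin holds no true variants."""
    total = c["tp"] + c["fn"]
    return c["tp"] / total if total else None


# Toy data, invented for this example only.
truth = [
    ("1", 100, "A", "G"),              # SNV, size 1
    ("1", 200, "A", "ATTT"),           # 3 bp insertion
    ("1", 300, "ACGTACGTACGTA", "A"),  # 12 bp deletion
]
called = [
    ("1", 100, "A", "G"),
    ("1", 200, "A", "ATTT"),
    ("1", 400, "C", "T"),              # call absent from the truth set
]
result = compare_by_size(truth, called)
```

Reporting accuracy per bin rather than as one aggregate number keeps the abundant small variants from masking weak performance on the rarer, larger ones.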

Availability and implementation: Code in Java and Python along with instructions to download the reads and variants is at http://bioinform.github.io/varsim.

Contact: rd@bina.com

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. VarSim simulation and validation workflow. The germline workflow can be run with or without the somatic workflow.
Fig. 2. Validation results for some popular secondary analysis tools.
