Sci Rep. 2013;3:2161.
doi: 10.1038/srep02161.

A practical method to detect SNVs and indels from whole genome and exome sequencing data

Daichi Shigemizu et al. Sci Rep. 2013.

Abstract

The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98% for WGS calls and 99.99% for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068% and 0.17%) and WES (0.0036% and 0.0084%) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7%, WES: 97.3%). We believe our method can contribute to the greater understanding of human diseases.
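To illustrate what a multinomial genotype model can look like for single-sample calling, here is a minimal generic sketch. It is not the method from the paper, whose details are not given in the abstract. The function names, the uniform error model, and the default `error_rate` of 0.01 are all assumptions made for illustration. Base counts at one site are scored against every diploid genotype, and the genotype with the highest multinomial log-likelihood is returned.

```python
import math

BASES = "ACGT"


def log_multinomial(counts, probs):
    """Log multinomial probability of the observed base counts."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def base_probs(genotype, error_rate):
    """Expected base frequencies (order A, C, G, T) for a diploid genotype.

    Assumption: each allele contributes half the reads, and an erroneous
    read shows one of the other three bases with equal probability.
    """
    probs = []
    for base in BASES:
        p = sum(
            0.5 * ((1 - error_rate) if allele == base else error_rate / 3)
            for allele in genotype
        )
        probs.append(p)
    return probs


def call_genotype(counts, error_rate=0.01):
    """Return the diploid genotype with the highest log-likelihood.

    counts: read counts per base in the order A, C, G, T.
    error_rate: hypothetical per-base sequencing error rate.
    """
    genotypes = [(a, b) for i, a in enumerate(BASES) for b in BASES[i:]]
    scores = {
        g: log_multinomial(counts, base_probs(g, error_rate)) for g in genotypes
    }
    return max(scores, key=scores.get)
```

Under these assumptions, a site with 15 A reads, 14 G reads and 1 T read is called heterozygous A/G, since a single T read is better explained as an error than as a third allele. A production caller would also weigh base and mapping qualities and genotype priors, which this sketch omits.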

Figures

Figure 1. Read depth per nucleotide and GC content.
(a) Distribution of read depth in WGS and WES on-target regions. (b) Distribution of GC content of WES on-target regions.
Figure 2. Common SNVs and indels identified by VCMM, GATK and SAMtools.
(a) SNVs in WGS. SNVs in repeat regions and unknown contigs were not used for the comparison. (b) Indels in WGS. Indels in repeat regions and unknown contigs were not used for the comparison. (c) SNVs in WES. (d) Coding indels in WES.
