Review
Curr Genet Med Rep. 2015;3(4):158-165.
doi: 10.1007/s40142-015-0076-8. Epub 2015 Sep 4.

Understanding the Basics of NGS: From Mechanism to Variant Calling

Dale Muzzey et al. Curr Genet Med Rep. 2015.

Abstract

Identifying disease-causing mutations in DNA has long been the goal of genetic medicine. In the last decade, the toolkit for discovering DNA variants has undergone rapid evolution: mutations that were historically discovered by analog approaches like Sanger sequencing and multiplex ligation-dependent probe amplification ("MLPA") can now be decoded from a digital signal with next-generation sequencing ("NGS"). Given the explosive growth of NGS-based tests in the clinic, it is of the utmost importance that medical practitioners have a fundamental understanding of the newest NGS methodologies. To that end, here we provide a very basic overview of how NGS works, with particular emphasis on the close resemblance between the underlying chemistry of Sanger sequencing and NGS. Using a pair of simple analogies, we develop an intuitive framework for understanding how high-confidence detection of single-nucleotide polymorphisms, indels, and large deletions/duplications is possible with NGS alone.

Keywords: Del/dup calling; Next-generation sequencing (NGS); Read depth; SNP/indel calling; Variant calling.

Figures

Fig. 1
NGS is a slightly modified, digital, and vastly scaled-up implementation of Sanger sequencing. In both methodologies, a polymerase copies template molecules by incorporating nucleotides from a pool that is either partially (Sanger) or entirely (NGS) composed of dyed and unextendable bases. Extension, arrangement, and detection are shared steps in both protocols but occur in a different order, and only NGS has a restoration step that converts bases to the undyed and extendable form.
Fig. 2
High-confidence SNP and indel calls are possible from NGS data with >20× depth. (a) SNPs and indels are conspicuous from NGS data after the reads (gray; each read is 28 bases long) are aligned to the reference genome (excerpted in black), and the confidence of each call depends on the depth at that position. (b) The three potential genotypes for a simple diploid variant are represented as different types of coins (top). A referee who lies about the coin-flip outcome 1 % of the time reports the results of 20 successive flips for three different coins (i–iii); the probability that the referee selected each type of coin is indicated after 2, 5, 10, and 20 flips, with the coin at right being the one with maximum probability. The probabilities indicated before the coin is flipped assume the coins model a genomic variant with 50 % minor allele frequency (“MAF”). (c) (i) Call confidence is shown as a function of the respective read depths for reference and alternate bases, where gray regions have confidence <99.9999 %, and the three colored regions have >99.9999 % confidence in homozygous reference (red), heterozygous (green), and homozygous alternate (yellow) calls. (ii) Each point shows the reference-versus-alternate read depth across sites with MAF ≥45 % in a typical targeted NGS experiment.
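The coin analogy in Fig. 2b can be read as a Bayesian update, and a minimal sketch of that reading follows. It is an illustration under stated assumptions, not the authors' calling algorithm: the function name `genotype_posteriors` is hypothetical, the mapping of coins to genotypes (two-headed, fair, two-tailed) is an interpretation of the caption, the priors are taken from Hardy-Weinberg proportions at the given MAF, and the referee's 1 % lie rate is treated as a per-read base error that is independent across reads.

```python
from math import comb


def genotype_posteriors(ref_reads, alt_reads, error_rate=0.01, maf=0.5):
    """Posterior probability of each diploid genotype given read counts.

    Hypothetical helper for illustration only. Assumes independent reads,
    a symmetric per-read error rate, and Hardy-Weinberg priors.
    """
    # Prior probability of each genotype before any read (coin flip) is seen.
    priors = {
        "hom_ref": (1 - maf) ** 2,
        "het": 2 * maf * (1 - maf),
        "hom_alt": maf ** 2,
    }
    # Probability that a single read reports the alternate base.
    # For a heterozygote the two error directions cancel, leaving 0.5.
    p_alt = {
        "hom_ref": error_rate,
        "het": 0.5,
        "hom_alt": 1 - error_rate,
    }
    depth = ref_reads + alt_reads
    unnormalized = {}
    for genotype, prior in priors.items():
        p = p_alt[genotype]
        # Binomial likelihood of the observed alt/ref split at this depth.
        likelihood = comb(depth, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
        unnormalized[genotype] = prior * likelihood
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


if __name__ == "__main__":
    # All-reference reads at the depths used in the figure (2, 5, 10, 20).
    for depth in (2, 5, 10, 20):
        post = genotype_posteriors(ref_reads=depth, alt_reads=0)
        best = max(post, key=post.get)
        print(depth, best, post[best])
```

Under these assumptions the posterior starts at the prior when no reads have been seen and concentrates on one genotype as depth grows, which is the behavior the caption describes for successive flips.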
Fig. 3
Del/dup calling from NGS data requires simple and intuitive processing of raw data. (a) Schematic of MLPA (top) and NGS (bottom) data for a sample in which one chromosome is normal and the other has a deletion. MLPA probes have a genome-binding sequence (shades of green), stuffer sequence to give them unique length (black), and binding sites for common primers (red) at the termini that enable multiplex amplification. For NGS, read depth can be pooled across a region (as depicted) or counted at a single site. In addition to depth data supporting a del/dup, NGS provides evidence of junction reads that further support the observation of a del/dup. For clarity, the ligation step that fuses two DNA fragments into the probes depicted in the figure is omitted. (b) A chocolate store that underperforms relative to others is revealed by dividing (i) the hypothetical annual sales volume for each store by its average (yielding ii) and then dividing once more by the monthly average across stores (giving iii). (c) Multiple samples with del/dups in the HBA locus are discovered by normalizing (i) the raw depth data across many sites by the sample average and then by the site average (yielding ii, where del/dup samples have thick traces).
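The two-step normalization described in Fig. 3c (divide by the sample average, then by the site average) can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' pipeline: the function names `normalize_depth` and `simulate_depth`, the per-site efficiencies, and the per-sample scale factors are all hypothetical, and the toy data contain one sample with a simulated heterozygous deletion at a single site.

```python
def normalize_depth(depth):
    """Normalize depth[sample][site] by sample average, then by site average.

    Hypothetical helper for illustration. Values near 1.0 suggest normal
    copy number; values near 0.5 or 1.5 suggest a deletion or duplication.
    """
    # Step 1: remove per-sample differences in overall sequencing depth.
    by_sample = []
    for row in depth:
        sample_mean = sum(row) / len(row)
        by_sample.append([d / sample_mean for d in row])

    # Step 2: remove per-site differences (for example, capture efficiency).
    n_samples = len(by_sample)
    n_sites = len(by_sample[0])
    site_means = [
        sum(row[j] for row in by_sample) / n_samples for j in range(n_sites)
    ]
    return [[row[j] / site_means[j] for j in range(n_sites)] for row in by_sample]


def simulate_depth():
    """Build toy depth data (all numbers are made up for this example)."""
    site_efficiency = [100, 200, 50, 150, 120, 80, 160, 140]
    sample_scale = [1.0, 2.0, 0.5, 1.5, 0.8, 1.0]
    depth = [[s * e for e in site_efficiency] for s in sample_scale]
    # Last sample carries a heterozygous deletion at site index 2,
    # so its depth at that site is halved.
    depth[-1][2] /= 2
    return depth


if __name__ == "__main__":
    for row in normalize_depth(simulate_depth()):
        print(" ".join(f"{value:.2f}" for value in row))
```

In the raw toy data the deleted site is hard to spot because depth varies widely by site and by sample; after both divisions the affected value stands out well below 1.0 while the rest cluster near 1.0. Because the affected sample contributes to both averages, the normalized value lands somewhat above the ideal 0.5 in a cohort this small.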
