Sci Rep. 2017 Oct 23;7(1):13751.
doi: 10.1038/s41598-017-13116-6.

High-frequency, low-coverage "false positives" mutations may be true in GS Junior sequencing studies

Zhiliang Yang et al. Sci Rep. 2017.

Abstract

The GS Junior sequencer provides simplified procedures for library preparation and data processing. Errors in pyrosequencing generate some biases during library construction and emulsion PCR amplification. False-positive mutations are identified by characteristics described in the manufacturer's manual, and some detected mutations have 'borderline' characteristics: they are detected in few reads or at low frequency. Some of these mutations, however, may be true positives. This study aimed to improve the accuracy of identifying true positives among mutations with borderline false-positive characteristics detected with GS Junior sequencing. Mutations with borderline features were tested for validity with Sanger sequencing. We examined 10 mutations detected at coverages <20-fold and frequencies >30% (group A) and 16 mutations detected at coverages >20-fold and frequencies <30% (group B). In group A, two mutations were not confirmed, and two mutations with 100% frequency were confirmed as heterozygous alleles. No mutation in group B was confirmed. The two groups had significantly different false-positive prevalences (p = 0.001). These results suggest that mutations detected at frequencies below 30% can be confidently identified as false positives, but that mutations detected at frequencies above 30%, despite coverages below 20-fold, should be verified with Sanger sequencing.
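The triage rule suggested by these results can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the abstract, not the authors' pipeline. The `Variant` class, the `triage` function, and the returned labels are hypothetical names. Treating a frequency of exactly 30% or a coverage of exactly 20-fold as the higher category is an assumption, since the abstract does not say how boundary values were handled.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract. How values exactly on a
# threshold are classified is an assumption made for this sketch.
COVERAGE_THRESHOLD = 20      # fold coverage (number of reads)
FREQUENCY_THRESHOLD = 30.0   # percent of reads carrying the mutation


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one mutation call."""
    name: str
    coverage: int       # total reads covering the position
    frequency: float    # percent of those reads showing the mutation


def triage(variant: Variant) -> str:
    """Return a suggested follow-up for a mutation call."""
    if variant.frequency < FREQUENCY_THRESHOLD:
        # The abstract's conclusion for calls below 30% frequency.
        return "likely false positive"
    if variant.coverage < COVERAGE_THRESHOLD:
        # Group A pattern: high frequency but low coverage.
        return "verify with Sanger sequencing"
    # High frequency and adequate coverage: not a borderline call
    # in the sense studied here.
    return "outside borderline groups"


if __name__ == "__main__":
    # Read counts taken from the captions of Figures 1 and 2.
    a7 = Variant("A7", coverage=3, frequency=100.0)
    a3 = Variant("A3", coverage=9, frequency=6 / 9 * 100)
    for v in (a7, a3):
        print(v.name, "->", triage(v))
```

Both figure examples (A7 with three reads, A3 with six mutated reads out of nine) fall in the "verify" branch, which matches the Sanger confirmation reported for them.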

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Electropherograms of representative mutation A7. (A) The read image from the GS Junior system; the sequence is shown in the sense orientation. There were three reads, and the reads showed a homozygous mutation. A contig (from contiguous) is a set of overlapping DNA segments that together represent a consensus region of DNA. Contig 00053 was the GenBank accession number in the software for the reads. (B) The electropherogram from Sanger sequencing; the sequence is shown in the sense orientation. The mutation was verified to be heterozygous.
Figure 2
Electropherograms of representative mutation A3. (A) The read image from the GS Junior system; the reads are shown in the antisense orientation. Six of the nine reads carried the mutation, which was considered a false positive according to the characteristic criteria in the user manual. Contig 00034 was the GenBank accession number in the software for the reads. (B) The electropherogram from Sanger sequencing; the sequence is shown in the sense orientation. The mutation was verified as heterozygous.
Figure 3
Possible mechanisms of false-read generation during library construction and emulsion PCR (emPCR). Each bead should yield one unique read. (A) Clonal amplification of a single DNA fragment on one bead in one droplet, which results in a unique read. If the capture beads do not capture enough fragments, or beads are washed off because of insufficient amplification, few final reads may result. (B) Erroneous fragments generated during library construction may be captured and amplified while the true fragments from the same position fail to amplify, which could produce high-frequency false positives. If both artificial and natural duplicates are captured and amplified, low-frequency false positives can result. (C) If several fragments are captured by one bead in one droplet, or some droplets break during emPCR, the result may also be a low-frequency false positive.
