PLoS One. 2013 Jul 23;8(7):e70388.
doi: 10.1371/journal.pone.0070388. Print 2013.

PCR-induced transitions are the major source of error in cleaned ultra-deep pyrosequencing data

Johanna Brodin et al. PLoS One. 2013.

Abstract

Background: Ultra-deep pyrosequencing (UDPS) is used to identify rare sequence variants. The sequence depth is influenced by several factors including the error frequency of PCR and UDPS. This study investigated the characteristics and source of errors in raw and cleaned UDPS data.

Results: UDPS of a 167-nucleotide fragment of the HIV-1 SG3Δenv plasmid was performed on the Roche/454 platform. The plasmid was diluted to one copy, PCR amplified and subjected to bidirectional UDPS on three occasions. The dataset consisted of 47,693 UDPS reads. Raw UDPS data had an average error frequency of 0.30% per nucleotide site. Most errors were insertions and deletions in homopolymeric regions. We used a cleaning strategy that removed almost all indel errors but had little effect on substitution errors; cleaning reduced the error frequency to 0.056% per nucleotide. In cleaned data the error frequency was similar in homopolymeric and non-homopolymeric regions, but varied considerably across sites. These site-specific error frequencies were moderately, but still significantly, correlated between runs (r=0.15-0.65) and between forward and reverse sequencing directions within runs (r=0.33-0.65). Furthermore, transition errors were 48 times more common than transversion errors (0.052% vs. 0.001%; p<0.0001). Collectively, the results indicate that a considerable proportion of the sequencing errors that remained after data cleaning were generated during the PCR that preceded UDPS.
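The transition versus transversion comparison reported above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy reads are hypothetical, and the sketch assumes that cleaned reads are already aligned to the reference, have the same length, and contain no indels.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("identical bases are not a substitution")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition;
    # any change between the two classes is a transversion.
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def substitution_error_frequencies(reference, reads):
    """Per-nucleotide transition and transversion frequencies, in percent.

    Assumes (hypothetically) that every read is pre-aligned to the
    reference, has the same length, and contains no indels.
    """
    counts = Counter()
    total_bases = 0
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must match the reference length")
        for ref_base, read_base in zip(reference, read):
            total_bases += 1
            if ref_base != read_base:
                counts[classify_substitution(ref_base, read_base)] += 1
    if total_bases == 0:
        raise ValueError("no bases to analyse")
    return {
        kind: 100.0 * counts[kind] / total_bases
        for kind in ("transition", "transversion")
    }


# Toy data for illustration only; these are not reads from the study.
toy_reference = "ACGT"
toy_reads = ["ACGT", "GCGT", "ACGA", "ACGT"]
toy_frequencies = substitution_error_frequencies(toy_reference, toy_reads)
```

In this toy input one A-to-G transition and one T-to-A transversion occur among 16 sequenced bases, so each class has a frequency of 6.25%. The same tally applied per site rather than across all sites would give site-specific error frequencies of the kind compared between runs and sequencing directions.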

Conclusions: A majority of the sequencing errors that remained after data cleaning were introduced by PCR prior to sequencing, which means that they will be independent of the platform used for next-generation sequencing. The transition vs. transversion error bias in cleaned UDPS data will influence the detection limits of rare mutations and sequence variants.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Examples of how different types of UDPS error were defined.

Figure 2. The average frequency of different substitution errors in percent (%) in cleaned UDPS data from three sequencing runs. Thick arrows indicate transitions and thin arrows indicate transversions.

Figure 3. Site-specific error frequencies in percent (%) in cleaned UDPS data obtained in the forward sequencing direction of run 1. All sequencing errors were substitutions since all deletions and insertions were removed by the data cleaning procedure. The bars are color-coded according to the type of substitution error. Homopolymeric regions are shaded.

Figure 4. PCR/UDPS error ratio in our cleaned data. This figure shows a comparison of the counts of reverse to forward variants of run 1. A) Our filtered UDPS data. B) Same data, normalized by the main variant forward and reverse counts.
