ISME J. 2012 Jul;6(7):1440-4.
doi: 10.1038/ismej.2011.208. Epub 2012 Jan 12.

Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences

David A W Soergel et al. ISME J. 2012 Jul.

Abstract

Microbial community profiling using 16S rRNA gene sequences requires accurate taxonomy assignments. 'Universal' primers target conserved sequences and amplify sequences from many taxa, but they provide variable coverage of different environments, and regions of the rRNA gene differ in taxonomic informativeness, especially when high-throughput short-read sequencing technologies (for example, 454 and Illumina) are used. We introduce a new evaluation procedure that provides an improved measure of expected taxonomic precision when classifying environmental sequence reads from a given primer. Applying this measure to thousands of combinations of primers and read lengths, simulating single-ended and paired-end sequencing, reveals that these choices greatly affect taxonomic informativeness. The most informative sequence region may differ by environment, partly due to variable coverage of different environments in reference databases. Using our Rtax method of classifying paired-end reads, we found that paired-end sequencing provides substantial benefit in some environments including human gut, but not in others. Optimal primer choice for short reads totaling 96 nt provides 82-100% of the confident genus classifications available from longer reads.
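The closing figure of the abstract (82-100%) is a ratio: the share of genus-level classifications from longer reads that the best short-read design still recovers. The sketch below only illustrates that arithmetic. The function name `genus_retention` and the sample values are hypothetical and are not taken from the paper or from Rtax.

```python
def genus_retention(short_read_fraction: float, long_read_fraction: float) -> float:
    """Return the share of long-read genus classifications kept by short reads.

    Both arguments are fractions of a sample classified to genus level.
    """
    if long_read_fraction <= 0:
        raise ValueError("long-read genus fraction must be positive")
    return short_read_fraction / long_read_fraction


# Hypothetical values for illustration only (not data from the paper):
# (fraction classified to genus with <= 96 nt, fraction with longer reads)
samples = {
    "sample_a": (0.41, 0.50),
    "sample_b": (0.30, 0.30),
}

retention = {
    name: genus_retention(short, long_)
    for name, (short, long_) in samples.items()
}

for name, value in retention.items():
    print(f"{name}: {value:.0%} of long-read genus classifications retained")
```

Under these made-up inputs the two samples land at the two ends of the reported range, which is the sense in which short reads "frequently, but not always" match longer ones.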

Figures

Figure 1
Classification performance, at three levels of estimated accuracy (Supplementary Methods), of 6617 possible choices of amplification primer, sequencing primer and read length for single-ended reads from different environments (left portion of each panel) and 3061 possible choices of primer pair and read length for paired-end reads (right portion). Combinations of primers and read lengths are sorted on the x axis according to a measure of overall classification performance (Supplementary Methods). Stacked bars show the proportion of non-chimeric, non-unique sequences from each sample (not the proportion of the total sample) that can be classified to each taxonomic level for each combination. See Supplementary Figure S1 and Supplementary Table S1 for the excluded proportion of novel (and thus a priori unclassifiable) sequences in each sample. The top of each colored section indicates how much of the sample can be classified to the given level or better. 'Primer miss' (black) indicates sequences that did not match a given primer and so would not be amplified. Classifications more specific than the genus level are exceedingly rare and so are not visible here. Horizontal lines indicate the maximum proportion of each sample classifiable to the genus level using 96 nt or less of sequence (i.e., with an optimal choice of primer or primer pair; see also Supplementary Tables S4 and S5), showing that short reads from the best primers frequently, but not always, provide taxonomic information nearly matching that obtained from longer read lengths. Full-size versions of these panels are available in the supplementary data.
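The stacking described in the caption, where the top of each colored section marks the share of the sample classified "to the given level or better", amounts to a cumulative sum from the most specific rank upward. The following is a minimal sketch of that bookkeeping, not the authors' code. The rank list, the labels `primer_miss` and `unclassified`, and the example reads are all assumptions made for illustration.

```python
from collections import Counter

# Assumed rank order, least to most specific (the caption notes that
# classifications below genus are too rare to be visible).
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def cumulative_proportions(deepest_ranks):
    """Map each rank to the fraction of reads classified to it or better.

    `deepest_ranks` holds one label per read: the most specific rank
    assigned, or "primer_miss" / "unclassified" (hypothetical labels).
    """
    total = len(deepest_ranks)
    if total == 0:
        raise ValueError("no reads supplied")
    counts = Counter(deepest_ranks)
    result = {}
    running = 0
    # Walk from genus up to domain so each rank includes all finer ranks.
    for rank in reversed(RANKS):
        running += counts.get(rank, 0)
        result[rank] = running / total
    # Reads that fail to match the primer form their own (black) segment.
    result["primer_miss"] = counts.get("primer_miss", 0) / total
    return result


# Hypothetical reads for one primer and read-length combination.
reads = [
    "genus", "genus", "family", "phylum", "primer_miss",
    "genus", "order", "unclassified", "family", "genus",
]
props = cumulative_proportions(reads)
print(props)
```

In this sketch primer misses stay in the denominator, matching the caption's treatment of them as a visible segment of each bar rather than an excluded category.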

