BMC Microbiol. 2015 Mar 21;15:66.
doi: 10.1186/s12866-015-0351-6.

The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies

J Paul Brooks et al. BMC Microbiol. 2015.

Abstract

Background: Characterizing microbial communities via next-generation sequencing is subject to a number of pitfalls involving sample processing. The observed community composition can be a severe distortion of the quantities of bacteria actually present in the microbiome, hampering analysis and threatening the validity of conclusions from metagenomic studies. We introduce an experimental protocol using mock communities for quantifying and characterizing bias introduced in the sample processing pipeline. We used 80 bacterial mock communities composed of prescribed proportions of cells from seven vaginally relevant bacterial strains to assess the bias introduced in the sample processing pipeline. We created two additional sets of 80 mock communities by mixing prescribed quantities of DNA and PCR product to quantify the relative contribution to bias of (1) DNA extraction, (2) PCR amplification, and (3) sequencing and taxonomic classification for particular choices of protocols for each step. We developed models to predict the "true" composition of environmental samples based on the observed proportions, and applied them to a set of clinical vaginal samples from a single subject during four visits.

Results: We observed that using different DNA extraction kits can produce dramatically different results, but bias is introduced regardless of the choice of kit. We observed error rates from bias of over 85% in some samples, while technical variation was very low, at less than 5% for most bacteria. The effects of DNA extraction and PCR amplification for our protocols were much larger than those due to sequencing and classification. The processing steps affected different bacteria in different ways, resulting in amplified and suppressed observed proportions of a community. When predictive models were applied to clinical samples from a subject, the predicted microbiome profiles were better reflections of the physiology and diagnosis of the subject at the visits than the observed community compositions.

Conclusions: Bias in 16S studies due to DNA extraction and PCR amplification will continue to require attention despite further advances in sequencing technology. Analysis of mock communities can help assess bias and facilitate the interpretation of results from environmental samples.
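
The three sets of mock communities described in the abstract (mixed as cells, as DNA, and as PCR product) can be read as a nested comparison: a cell mixture passes through extraction, PCR, and sequencing; a DNA mixture skips extraction; a PCR-product mixture passes through sequencing and classification only. The sketch below assumes, purely for illustration, that the step effects add on the proportion scale, so that differences between experiments isolate each step. This additive split is an assumption and is not the authors' model. The function name `step_bias`, the taxon labels, and every number are hypothetical.

```python
def step_bias(actual, obs_cells, obs_dna, obs_pcr):
    """Split total bias per taxon into per-step parts.

    ASSUMPTION: step effects are additive on the proportion scale.
    All arguments map taxon name -> proportion for the same prescribed mixture.
    """
    result = {}
    for taxon, truth in actual.items():
        result[taxon] = {
            # Cell mixture vs DNA mixture: only extraction differs.
            "extraction": obs_cells[taxon] - obs_dna[taxon],
            # DNA mixture vs PCR-product mixture: only PCR differs.
            "pcr": obs_dna[taxon] - obs_pcr[taxon],
            # PCR-product mixture vs prescribed mix: sequencing and classification.
            "sequencing": obs_pcr[taxon] - truth,
            # Cell mixture vs prescribed mix: the whole pipeline.
            "total": obs_cells[taxon] - truth,
        }
    return result


# Made-up proportions for a two-taxon equal mixture (not data from the study).
actual = {"taxon_a": 0.5, "taxon_b": 0.5}
obs_cells = {"taxon_a": 0.8, "taxon_b": 0.2}
obs_dna = {"taxon_a": 0.7, "taxon_b": 0.3}
obs_pcr = {"taxon_a": 0.55, "taxon_b": 0.45}

parts = step_bias(actual, obs_cells, obs_dna, obs_pcr)
for taxon, p in parts.items():
    print(taxon, {k: round(v, 3) for k, v in p.items()})
```

By construction the three step terms sum to the total for each taxon, which makes the split easy to sanity-check but also means any interaction between steps is folded into whichever difference happens to absorb it.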

Figures

Figure 1
Schematic of three mixture experiments and observed results. In Experiment 1, bacterial cultures were mixed so that communities were composed of equal numbers of cells. In Experiment 2, DNA was extracted from pure bacterial cultures and then mixed so that communities were composed of equal amounts of DNA. In Experiment 3, DNA was extracted from pure bacterial cultures and subjected to PCR, and the PCR product was mixed so that communities were composed of equal amounts of PCR product. The pie charts in the bottom row are the observed results for a sample that consisted of equal proportions of seven bacteria for each experiment. The pie charts in the other rows represent the prescribed mixing ratios (each slice is of equal size). Key: red - G. vaginalis, orange - S. agalactiae, purple - S. amnii, green - P. bivia, light blue - L. iners, yellow - L. crispatus, brown - A. vaginae.
Figure 2
Observed bias by bacterium. The observed bias (the observed minus the actual proportions) for each bacterium in the experimental design due to the different effects of our DNA Extraction, PCR amplification, and sequencing and taxonomic classification protocols. The total bias is also plotted for each bacterium. For each box and whisker plot, only the samples including the bacterium were included.
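
The bias definition in this caption (observed minus actual proportion, computed only over samples that include the bacterium) can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function name `bias_by_bacterium`, the taxon labels, and the sample values are hypothetical.

```python
def bias_by_bacterium(samples):
    """Collect per-taxon bias values across mock-community samples.

    samples: list of (actual, observed) pairs, each a dict of
             taxon name -> proportion.
    Returns taxon -> list of (observed - actual), using only the samples
    in which the taxon was actually part of the prescribed mixture.
    """
    biases = {}
    for actual, observed in samples:
        for taxon, truth in actual.items():
            if truth > 0:  # skip samples that do not include this taxon
                # A taxon absent from the observed profile counts as 0.
                diff = observed.get(taxon, 0.0) - truth
                biases.setdefault(taxon, []).append(diff)
    return biases


# Made-up samples (not data from the study).
samples = [
    ({"taxon_a": 0.5, "taxon_b": 0.5}, {"taxon_a": 0.7, "taxon_b": 0.3}),
    ({"taxon_a": 1.0, "taxon_b": 0.0}, {"taxon_a": 0.98, "taxon_b": 0.02}),
]

biases = bias_by_bacterium(samples)
for taxon, values in biases.items():
    print(taxon, [round(v, 3) for v in values])
```

Each resulting list is the kind of input a box and whisker plot would summarize; a positive value means the taxon was over-represented in the observed profile and a negative value means it was suppressed.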
Figure 3
Interaction/blending plots for L. crispatus and (a) G. vaginalis and (b) S. amnii. The contours indicate the expected observed amount of L. crispatus for a given actual percentage of a sample for a pair of bacteria.
Figure 4
Results for mixture of L. crispatus and S. agalactiae. Actual and observed proportions of bacteria when mixing equal proportions of cells (Exp. 1), DNA (Exp. 2), and PCR product (Exp. 3) for L. crispatus and S. agalactiae.
Figure 5
(a) Observed and (b) predicted proportions of bacteria of four clinical samples. The samples are from the same subject in different visits.
