Nat Methods. 2011 Jul 17;8(9):761-3.
doi: 10.1038/nmeth.1650.

Bayesian community-wide culture-independent microbial source tracking

Dan Knights et al. Nat Methods. 2011.

Abstract

Contamination is a critical issue in high-throughput metagenomic studies, yet progress toward a comprehensive solution has been limited. We present SourceTracker, a Bayesian approach to estimate the proportion of contaminants in a given community that come from possible source environments. We applied SourceTracker to microbial surveys from neonatal intensive care units (NICUs), offices and molecular biology laboratories, and provide a database of known contaminants for future testing.

Figures

Figure 1. Comparison of SourceTracker and alternative models
Three models were used to estimate the proportions of two source environments in a set of simulated samples, as the degree of overlap between the environments was varied from a Jensen-Shannon divergence (JSD) of 0 (completely identical, and thus impossible to disambiguate) to a JSD of 1 (completely non-overlapping, and thus trivial to disambiguate). The coefficients of determination (R²) of the estimated proportions are plotted. Each point represents the mean R² for three trials of 100 samples each; error bars show s.e.m. (n = 3).
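The two quantities named in this caption can be illustrated with a minimal sketch. It assumes the JSD is computed with base-2 logarithms (which bounds it to the 0-to-1 range described above) and uses one common definition of the coefficient of determination; the function names and toy distributions are hypothetical and are not taken from the SourceTracker software.

```python
import math

def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions, using log base 2.

    With base-2 logs the result lies in [0, 1]: 0 for identical
    distributions, 1 for distributions with no shared support.
    """
    # Mixture distribution: the midpoint of p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def r_squared(true_vals, est_vals):
    """Coefficient of determination, 1 - SS_res / SS_tot (an assumed definition)."""
    mean_true = sum(true_vals) / len(true_vals)
    ss_res = sum((t - e) ** 2 for t, e in zip(true_vals, est_vals))
    ss_tot = sum((t - mean_true) ** 2 for t in true_vals)
    return 1 - ss_res / ss_tot

# Toy taxon distributions for two hypothetical source environments.
identical = jensen_shannon_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0])
```

Here `identical` and `disjoint` correspond to the two extremes of the overlap axis in the figure; intermediate overlaps give values strictly between them.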
Figure 2. SourceTracker proportion estimates for a subset of sink samples
Source environment proportions were estimated using SourceTracker and 45 training samples from each source environment. (a) Pie charts of the mean proportions for 100 draws from Gibbs sampling. (b) Bar charts for three samples including standard deviations of the proportion estimates. (c) Direct visualization of 100 Gibbs draws for the samples in (b); each column shows the mixture from one draw, with columns sorted by the most prevalent source. The first sample, Lab 1: PCR water 1, shows several possible mixtures: all Unknown; Gut and Skin (most common); and Gut and Soil. The second sample shows poor disambiguation between Gut, Skin, and Unknown. Most mixtures were stable like the third sample; the first two were chosen for demonstrative purposes.
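The summaries described in this caption (mean proportions, standard deviations, and draws ordered for display) can be sketched as follows. This is an illustration only: the draws are made-up numbers, the function names are hypothetical, the sample standard deviation is an assumption, and sorting by the source with the highest mean proportion is one possible reading of "sorted by the most prevalent source".

```python
from statistics import mean, stdev

def summarize_draws(draws, sources):
    """Per-source mean and sample standard deviation across Gibbs draws.

    `draws` is a list of draws; each draw is a list of source
    proportions in the same order as `sources`.
    """
    summary = {}
    for i, name in enumerate(sources):
        values = [draw[i] for draw in draws]
        summary[name] = (mean(values), stdev(values))
    return summary

def sort_by_top_source(draws):
    """Order draws by the proportion of the source with the highest mean."""
    n_sources = len(draws[0])
    means = [mean(draw[i] for draw in draws) for i in range(n_sources)]
    top = means.index(max(means))
    return sorted(draws, key=lambda draw: draw[top], reverse=True)

# Made-up example: three draws over two sources.
sources = ["Gut", "Skin"]
draws = [[0.6, 0.4], [0.8, 0.2], [0.7, 0.3]]
summary = summarize_draws(draws, sources)
ordered = sort_by_top_source(draws)
```

A large standard deviation in `summary` would flag an unstable mixture, like the first two samples in panel (b).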
Figure 3. Relative abundance of common contaminating operational taxonomic units (OTUs)
For all sink sequences assigned to a known source environment (Gut, Oral, Skin, or Soil) by SourceTracker, these ten OTUs had the highest average relative abundance across sink environments. Note that the OTU classified as Enterobacter, a lineage commonly seen in the gut, was more prevalent in the Skin training samples than the Gut training samples.
