PLoS Comput Biol. 2015 Dec 18;11(12):e1004557.
doi: 10.1371/journal.pcbi.1004557. eCollection 2015 Dec.

High-Specificity Targeted Functional Profiling in Microbial Communities with ShortBRED

James Kaminski et al. PLoS Comput Biol. 2015.

Abstract

Profiling microbial community function from metagenomic sequencing data remains a computationally challenging problem. Mapping millions of DNA reads from such samples to reference protein databases requires long run-times, and short read lengths can result in spurious hits to unrelated proteins (loss of specificity). We developed ShortBRED (Short, Better Representative Extract Dataset) to address these challenges, facilitating fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences ("markers") and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. After evaluating ShortBRED on synthetic data, we applied it to profile antibiotic resistance protein families in the gut microbiomes of individuals from the United States, China, Malawi, and Venezuela. Our results support antibiotic resistance as a core function in the human gut microbiome, with tetracycline-resistant ribosomal protection proteins and Class A beta-lactamases being the most widely distributed resistance mechanisms worldwide. ShortBRED markers are applicable to other homology-based search tasks, which we demonstrate here by identifying phylogenetic signatures of antibiotic resistance across more than 3,000 microbial isolate genomes. ShortBRED can be applied to profile a wide variety of protein families of interest; the software, source code, and documentation are available for download at http://huttenhower.sph.harvard.edu/shortbred.
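The two components described in the abstract can be illustrated with a toy Python sketch. It is not the ShortBRED implementation: the published tool relies on alignment-based search (the Fig 3 caption mentions USEARCH in ShortBRED-Quantify), whereas this sketch substitutes exact substring matching on already-translated reads. The function names `find_markers` and `count_marker_hits`, the parameter `k`, and all sequences are hypothetical and chosen only for illustration.

```python
def find_markers(targets, background, k=6):
    """Toy marker identification (assumption: exact k-mer uniqueness).

    For each protein of interest, keep the length-k amino acid substrings
    that occur in no other protein of interest and in no background protein.
    """
    markers = {}
    for name, seq in targets.items():
        others = [s for n, s in targets.items() if n != name] + list(background)
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        markers[name] = sorted(m for m in kmers if not any(m in o for o in others))
    return markers


def count_marker_hits(markers, translated_reads):
    """Toy quantification: count reads that contain at least one marker.

    Assumes reads are already translated to amino acids; a real search step
    would use an aligner and tolerate mismatches.
    """
    counts = {name: 0 for name in markers}
    for read in translated_reads:
        for name, family_markers in markers.items():
            if any(m in read for m in family_markers):
                counts[name] += 1
    return counts


# Hypothetical sequences: the two families share the prefix "MKTAYIAKQR".
targets = {"famA": "MKTAYIAKQRQISFVK", "famB": "MKTAYIAKQRLLGGHW"}
background = ["GGQISFVKGG"]  # unrelated protein that shares "QISFVK" with famA
markers = find_markers(targets, background)

reads = ["AKQRQISF", "MKTAYIAK", "QRLLGGHW", "KQRLLGGH", "GGQISFVK"]
print(count_marker_hits(markers, reads))
```

In this toy data, the read `MKTAYIAK` falls in the region shared by both families and the read `GGQISFVK` matches only the background protein, so neither is counted. That is the specificity argument in miniature: reads that could hit several proteins are ignored because no marker covers those regions.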

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The ShortBRED algorithm.
ShortBRED-Identify creates distinctive markers for protein families of interest. ShortBRED-Quantify maps nucleotide reads to markers and normalizes abundance.
Fig 2. Accuracy of ShortBRED and centroid-based profiling within synthetic metagenomes.
(A, B) ROC curves report the sensitivity and specificity (in terms of TPR and FPR) of the two methods for correctly identifying the presence and absence of protein families of interest in six synthetic metagenomes, spiked with 5%, 10%, and 25% of their material from the ARDB (panel A) and VFDB (panel B). (C, D) Scatterplots of protein family abundance “predicted from mapping” (the values calculated by ShortBRED and by the centroids) versus “expected from gold standard” (the abundance values of the protein families in the 10% synthetic metagenome).
Fig 3. Speed of execution: ShortBRED versus centroid-based profiling.
Results are based on time used by USEARCH in ShortBRED-Quantify.
Fig 4. Antibiotic resistance in the human gut microbiome.
RPKM values produced by ShortBRED for antibiotic resistance protein families, summed by class of resistance. Samples in the USA-Global, Venezuela, and Malawi cohorts were profiled by mapping reads to centroids due to their lower sequencing depth. Marker information is listed in Table 2. Samples (columns) were clustered according to Canberra distance and antibiotic resistance families (rows) were clustered according to Euclidean distance.
Fig 5. Prevalence of antibiotic resistance across bacterial isolate genomes.
Phylogenetic tree of bacterial genomes from IMG [24] overlaid with presence/absence of ShortBRED antibiotic resistance protein families. The outermost ring indicates the share of genes in each species’ genome that mapped to any of the AR protein families. This figure was produced using GraPhlAn [27].
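
The Fig 4 caption refers to RPKM values and to Canberra distance between samples. As a minimal sketch, both quantities can be written in a few lines of Python using their standard textbook definitions. How ShortBRED defines the length term for amino acid markers is not stated in this excerpt, so the `length_nt` parameter (length in nucleotides) is an assumption, as are the function names and example numbers.

```python
def rpkm(hits, length_nt, total_reads):
    """Reads per kilobase per million reads (standard definition).

    hits        -- reads assigned to the feature
    length_nt   -- feature length in nucleotides (assumed unit)
    total_reads -- total reads in the sample
    """
    return hits / ((length_nt / 1e3) * (total_reads / 1e6))


def canberra(u, v):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over paired entries.

    Terms where both entries are zero are skipped, a common convention.
    """
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


# Hypothetical numbers: 50 hits to a 500-nt feature in a 2-million-read sample.
print(rpkm(50, 500, 2_000_000))
# Hypothetical abundance profiles for two samples.
print(canberra([1.0, 0.0, 3.0], [3.0, 0.0, 1.0]))
```

Because each Canberra term is scaled by the sum of the two values, low-abundance families contribute as much as high-abundance ones, which is one reason the measure is often used for sparse abundance profiles.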

References

    1. Smillie CS, Smith MB, Friedman J, Cordero OX, David LA, Alm EJ. Ecology drives a global network of gene exchange connecting the human microbiome. Nature. 2011;480(7376):241–4. doi: 10.1038/nature10571
    2. Abubucker S, Segata N, Goll J, Schubert AM, Izard J, Cantarel BL, et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol. 2012;8(6):e1002358. doi: 10.1371/journal.pcbi.1002358
    3. Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464(7285):59–65. doi: 10.1038/nature08821
    4. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
    5. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–1. doi: 10.1093/bioinformatics/btq461
