J Microbiol Methods. 2018 Nov;154:6-13.
doi: 10.1016/j.mimet.2018.09.019. Epub 2018 Sep 29.

A multi-amplicon 16S rRNA sequencing and analysis method for improved taxonomic profiling of bacterial communities

Andrew E Schriefer et al. J Microbiol Methods. 2018 Nov.

Abstract

Metagenomic sequencing of bacterial samples has become the gold standard for profiling microbial populations, but 16S rRNA profiling remains widely used due to advantages in sample throughput, cost, and sensitivity, even though the approach is hampered by primer bias and lack of specificity. We hypothesized that a hybrid approach that combined targeted PCR amplification with high-throughput sequencing of multiple regions of the genome would capture many of the advantages of both approaches. We developed a method that identifies and quantifies members of bacterial communities through simultaneous analysis of multiple variable regions of the bacterial 16S rRNA gene. The method combines high-throughput microfluidics for PCR amplification, short-read DNA sequencing, and a custom algorithm named MVRSION (Multiple 16S Variable Region Species-Level IdentificatiON) for optimizing taxonomic assignment. MVRSION performance was compared to single variable region analyses (V3 or V4) of five synthetic mixtures of human gut bacterial strains using existing software (QIIME), and to the results of community profiling by shotgun sequencing (COPRO-Seq) of fecal DNA samples collected from gnotobiotic mice colonized with a defined, phylogenetically diverse consortium of human gut bacterial strains. Positive predictive values for MVRSION ranged from 65% to 91% versus 44% to 61% for single-region QIIME analyses (p < .01, p < .001), while the abundance estimate r² for MVRSION compared to COPRO-Seq was 0.77 versus 0.46 and 0.45 for V3-QIIME and V4-QIIME, respectively. MVRSION represents a generally applicable tool for taxonomic classification that is superior to single-region 16S rRNA methods, resource efficient, and highly scalable for assessing the microbial composition of up to thousands of samples concurrently. Its applications range from whole community profiling to targeted tracking of organisms of interest in diverse habitats as a function of specified variables/perturbations.

Keywords: 16S rRNA gene; Microbial community analysis; Microbial diversity; Next generation sequencing.

Figures

Figure 1. Overview of the MVRSION method for identifying microbial species.
There are two major components to the MVRSION algorithm. (a) Two MVRSION databases are compiled for use with all sequencing datasets. A curated database contains accurately annotated (bacterial species-level), full-length 16S rRNA sequences from public sources (SILVA). Using this curated 16S rRNA database and known amplicon primer sequences, in silico PCR predicts a list of all amplicon sequences across the nine 16S rRNA variable regions of all bacterial species in the curated database. These databases are subsequently used for processing all input datasets. (b) For sample processing, amplicon sequencing reads are mapped to the 16S rRNA sequences in the curated database. A list of candidate species is generated from all species with reads mapping to four or more variable regions. For each candidate species, the predicted amplicons are compared to those of all other candidate species. From the predicted amplicon comparison, the variable region(s) with sequences most unique to that candidate species versus all other candidates are selected as its “discriminatory variable regions”. In parallel, the original input reads are re-aligned to just the candidate species, as all other species from the curated 16S rRNA database have been eliminated from consideration. If a requisite number of reads have been mapped to a candidate species' discriminatory variable region(s) in this realignment, the species is called present. For all species called present, the abundance is estimated as described in Section 5.5.5.
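The candidate-selection and discriminatory-region steps of panel (b) can be illustrated with a minimal Python sketch. It is not the MVRSION implementation. The function names, the input shapes, and the use of exact sequence identity as a stand-in for "most unique" are all assumptions made for illustration; the published algorithm may score uniqueness differently.

```python
from collections import defaultdict

# Threshold taken from the caption: reads must map to four or more variable regions.
MIN_REGIONS = 4


def candidate_species(read_hits, min_regions=MIN_REGIONS):
    """Return species whose mapped reads cover at least `min_regions` regions.

    `read_hits` is an iterable of (species, region) pairs, one per mapped read
    (a hypothetical input format chosen for this sketch).
    """
    regions_seen = defaultdict(set)
    for species, region in read_hits:
        regions_seen[species].add(region)
    return {sp for sp, regions in regions_seen.items() if len(regions) >= min_regions}


def discriminatory_regions(predicted, candidates):
    """Pick, per candidate, the region(s) shared with the fewest other candidates.

    `predicted` maps species -> {region: predicted amplicon sequence}.
    Assumption: two amplicons are "shared" only if they are identical strings.
    """
    result = {}
    for sp in candidates:
        shared_counts = {}
        for region, seq in predicted[sp].items():
            shared_counts[region] = sum(
                1
                for other in candidates
                if other != sp and predicted[other].get(region) == seq
            )
        if not shared_counts:
            result[sp] = []
            continue
        fewest = min(shared_counts.values())
        result[sp] = sorted(r for r, n in shared_counts.items() if n == fewest)
    return result


# Toy data: species C covers only two regions, so it is not a candidate.
hits = [(sp, r) for sp in ("A", "B") for r in ("V1", "V2", "V3", "V4")]
hits += [("C", "V1"), ("C", "V2")]
amplicons = {
    "A": {"V1": "ACGT", "V2": "GGCC", "V3": "TTAA", "V4": "CCGG"},
    "B": {"V1": "ACGT", "V2": "GGCA", "V3": "TTAA", "V4": "CCGG"},
    "C": {"V1": "ACGT", "V2": "GGCC"},
}
candidates = candidate_species(hits)
regions = discriminatory_regions(amplicons, candidates)
```

In this toy example, A and B differ only in their V2 amplicon, so V2 is the region that would be checked for reads after realignment. The "requisite number of reads" threshold is not given in the caption and is therefore left out of the sketch.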
Figure 2. MVRSION and single 16S rRNA variable region QIIME comparative analyses.
(a-d) Species-level assessments of the comparative measures True Positives (TP), False Positives (FP), and False Negatives (FN) as computed for the three analytical methods (MVRSION, V3-QIIME, and V4-QIIME) utilizing the synthetic mixtures (HM-782D, 48G-Eq, combined 48G-Stg1–3), and 92 fecal samples from gnotobiotic mice, respectively. (e-h) Calculated Positive Predictive Values (PPV) and Sensitivity (Sens) for the three analytical methods and samples. Asterisks mark significant improvements for MVRSION compared to single variable region analyses with QIIME: **, P<0.01; ***, P<0.001; ****, P<0.0001 (two-tailed unpaired t-test, equal variance). (i-k) MVRSION, V3-QIIME, and V4-QIIME estimated relative abundance for all taxa identified in all samples, compared to their relative abundance based on known DNA concentrations in synthetic mixtures or COPRO-Seq analysis of DNAs prepared from fecal samples obtained from gnotobiotic mice harboring a defined model human gut microbiota. While all correlations are significant (P<0.0001), MVRSION demonstrates a markedly higher r² value (0.77) compared to V3-QIIME (0.46) and V4-QIIME (0.45). (l) V4-QIIME analysis was run at multiple levels of abundance filtering, as described in Section 5.6, to illustrate the optimization of MVRSION for both sensitivity and specificity.
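For readers unfamiliar with the measures in panels (e-k), the following Python sketch shows how PPV and sensitivity follow from TP, FP, and FN counts, and how an r² value can be obtained from paired abundance estimates. The function names are hypothetical. Treating r² as the squared Pearson correlation is an assumption, since the caption does not state how it was computed, and the numbers in the usage lines are made-up toy values, not data from the paper.

```python
def ppv(tp, fp):
    """Positive predictive value: fraction of called species that are truly present."""
    if tp + fp == 0:
        raise ValueError("PPV is undefined when no species are called")
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Sensitivity: fraction of truly present species that were called."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined when no species are present")
    return tp / (tp + fn)


def r_squared(estimated, reference):
    """Squared Pearson correlation between two equal-length abundance lists.

    Assumption: r² here means the square of Pearson's r.
    """
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(estimated)
    mean_x = sum(estimated) / n
    mean_y = sum(reference) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimated, reference))
    sxx = sum((x - mean_x) ** 2 for x in estimated)
    syy = sum((y - mean_y) ** 2 for y in reference)
    if sxx == 0 or syy == 0:
        raise ValueError("r² is undefined when one list has zero variance")
    return (sxy * sxy) / (sxx * syy)


# Toy values for illustration only.
example_ppv = ppv(tp=8, fp=2)
example_sens = sensitivity(tp=8, fn=2)
example_r2 = r_squared([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

A perfectly linear relationship between estimated and reference abundances, as in the last toy line, gives an r² of 1 up to floating-point rounding; values such as the 0.77 reported for MVRSION indicate a looser but still substantial agreement.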

