Front Genet. 2019 Jul 12:10:653.
doi: 10.3389/fgene.2019.00653. eCollection 2019.

Can Targeting Non-Contiguous V-Regions With Paired-End Sequencing Improve 16S rRNA-Based Taxonomic Resolution of Microbiomes?: An In Silico Evaluation


Nishal Kumar Pinna et al. Front Genet. 2019.

Abstract

Background: Next-generation sequencing (NGS) technologies have enabled probing of microbial diversity in different environmental niches with unprecedented sequencing depth. However, due to read-length limitations of popular NGS technologies, 16S amplicon sequencing-based microbiome studies rely on targeting short stretches of the 16S rRNA gene encompassing a selection of variable (V) regions. In most cases, such a short stretch constitutes a single V-region or a couple of V-regions placed adjacent to each other on the 16S rRNA gene. Given that different V-regions have different resolving ability with respect to various taxonomic groups, selecting the optimal V-region (or a combination thereof) remains a challenge.

Methods: The accuracies of taxonomic profiles generated from sequences encompassing 1) individual V-regions, 2) adjacent V-regions, and 3) pairs of non-contiguous V-regions were assessed and compared. Subsequently, the discriminating capability of different V-regions with respect to different taxonomic lineages was assessed. Finally, the possibility of using paired-end sequencing protocols to target combinations of non-adjacent V-regions was evaluated, with a focus on whether such an experimental design provides improved taxonomic resolution.

Results: Extensive validation with simulated microbiome datasets mimicking different environmental and host-associated microbiome samples suggests that targeting certain combinations of non-contiguously placed V-regions might yield better taxonomic classification accuracy than conventional 16S amplicon sequencing targets. This work also puts forward a novel in silico combinatorial strategy that enables creation of consensus taxonomic profiles from experiments targeting multiple pair-wise combinations of V-regions to improve accuracy in taxonomic classification.

Conclusion: The study suggests that targeting non-contiguous V-regions with paired-end sequencing can improve 16S rRNA-based taxonomic resolution of microbiomes. Furthermore, employing the novel in silico combinatorial strategy can improve taxonomic classification without significant additional experimental cost or effort. The empirical observations obtained can potentially serve as a guideline for future 16S microbiome studies and help researchers choose the optimal combination of V-regions for a specific experiment or sampled environment.

Keywords: amplicon sequencing; metagenomics 16S; microbiome analysis; paired-end sequencing; taxonomic profiling.
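
The design space described in the abstract (pairs of adjacent versus non-contiguous V-regions) can be enumerated with a minimal sketch. It assumes the nine canonical 16S rRNA variable regions V1 to V9, which is consistent with the 36 pair-wise combinations mentioned in the Figure 5 caption; the function name `pairwise_targets` is hypothetical and this is not the authors' code.

```python
from itertools import combinations

# Assumption: the nine canonical 16S rRNA variable regions, V1..V9.
V_REGIONS = [f"V{i}" for i in range(1, 10)]


def pairwise_targets(regions=V_REGIONS):
    """Return (pair, is_contiguous) for every unordered pair of regions.

    A pair is treated as contiguous when the two regions are direct
    neighbours on the gene (e.g. V3 and V4).
    """
    index = {name: i for i, name in enumerate(regions)}
    targets = []
    for a, b in combinations(regions, 2):
        contiguous = index[b] - index[a] == 1
        targets.append(((a, b), contiguous))
    return targets


targets = pairwise_targets()
contiguous = [pair for pair, flag in targets if flag]
non_contiguous = [pair for pair, flag in targets if not flag]

# Prints: 36 8 28
print(len(targets), len(contiguous), len(non_contiguous))
```

Under this assumption, 8 of the 36 pairs are contiguous and 28 are non-contiguous, so most of the candidate targets are only reachable with a design that reads the two regions separately, such as the paired-end protocol the study evaluates.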


Figures

Figure 1
Combinatorial strategy for targeting multiple pair-wise combinations of non-contiguous (or contiguous) V-regions. The strategy relies on obtaining taxonomic abundance profiles of a microbial community from two paired-end sequencing experiments, each of which targets different pair-wise combinations of V-regions. The two taxonomic profiles are then combined based on the pre-calculated accuracies of individual V-regions (targeted in the two experiments) in resolving each of the taxonomic groups under consideration.
Figure 2
Taxonomic classification accuracies at genus level for different variable regions. Plot depicting the percentage of 16S rRNA genes present in the RDP database that could be correctly classified using different variable (V) regions (see Methods). Correct classifications obtained using full-length 16S sequences are also depicted for comparison. Taxonomic classification accuracy at genus level has been considered in this plot and has been cumulated and depicted at the phylum level (only for the five most represented phyla in the downloaded RDP sequences).
Figure 3
Taxonomic classification accuracies at species level for different variable regions. Plot depicting the average taxonomic classification accuracies obtained at species level using different pair-wise combinations of V-regions (both contiguous as well as non-contiguous) drawn from the 16S rRNA genes. 16S rRNA genes used for the evaluation were retrieved from the RDP database (see Methods).
Figure 4
Taxonomic classification accuracies obtained using different pair-wise combinations of V-regions (contiguous as well as non-contiguous). Accuracy of taxonomic assignments has been evaluated at the species level and cumulated at phylum level for representation (only for the five most represented phyla in the downloaded RDP sequences). Combinations of V-regions achieving a classification accuracy of ≥ 70% (averaged for the depicted phyla) are shown. Combinations of contiguously placed V-regions have been indicated with an asterisk (*).
Figure 5
Evaluation of taxonomic classification efficiency on simulated microbiomes. Taxonomic classification efficiency of different combinations of V-regions evaluated on nine simulated microbiome datasets mimicking different environmental niches. Taxonomic classification accuracy, expressed as the percentage of correct assignments at species level, is indicated in the heatmap. The color scale (1–36) depicts the performance rank of different combinations of V-regions (total of 36 combinations) in terms of taxonomic classification accuracy for each of the simulated microbiomes (presented in columns).
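
The combinatorial strategy in Figure 1 merges two taxonomic profiles using pre-calculated, per-taxon accuracies of the targeted V-regions. The caption does not spell out the combination rule, so the sketch below uses one plausible rule as an assumption: for each taxon, take the abundance from the experiment with the higher pre-calculated accuracy for that taxon, then renormalize. The function name `consensus_profile` and all numeric values are hypothetical placeholders, not the authors' method or results from the study.

```python
def consensus_profile(profile_a, profile_b, accuracy_a, accuracy_b):
    """Combine two relative-abundance profiles taxon by taxon.

    Hypothetical rule (an assumption, not taken from the paper): for each
    taxon, trust the experiment whose targeted V-regions have the higher
    pre-calculated accuracy for that taxon, then renormalize to sum to 1.
    """
    taxa = set(profile_a) | set(profile_b)
    combined = {}
    for taxon in taxa:
        acc_a = accuracy_a.get(taxon, 0.0)
        acc_b = accuracy_b.get(taxon, 0.0)
        source = profile_a if acc_a >= acc_b else profile_b
        combined[taxon] = source.get(taxon, 0.0)
    total = sum(combined.values())
    if total == 0:
        return combined
    return {taxon: value / total for taxon, value in combined.items()}


# Illustrative placeholder inputs only; not data from the study.
profile_a = {"Bacteroides": 0.5, "Escherichia": 0.5}    # experiment 1
profile_b = {"Bacteroides": 0.25, "Escherichia": 0.75}  # experiment 2
accuracy_a = {"Bacteroides": 0.9, "Escherichia": 0.6}
accuracy_b = {"Bacteroides": 0.7, "Escherichia": 0.8}

result = consensus_profile(profile_a, profile_b, accuracy_a, accuracy_b)
print(sorted(result.items()))
```

With these placeholder inputs, Bacteroides is taken from the first experiment and Escherichia from the second, giving renormalized abundances of roughly 0.4 and 0.6. Other rules, such as an accuracy-weighted average of the two abundances, would fit the caption equally well.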
