PLoS One. 2014 Dec 15;9(12):e114804.
doi: 10.1371/journal.pone.0114804. eCollection 2014.

IM-TORNADO: a tool for comparison of 16S reads from paired-end libraries

Patricio Jeraldo et al. PLoS One. 2014.

Abstract

Motivation: 16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze the data from each read separately and underutilize the information contained in the paired-end reads.

Results: We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that using both reads produces results that correlate more strongly with those from full-length 16S rDNA in terms of taxonomy, phylogeny, and beta-diversity.

Availability and implementation: IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.

Conflict of interest statement

Competing Interests: The authors would also note here that Prof. Bryan White is a member of the Editorial Board of PLOS ONE. This does not alter the authors' adherence to PLOS ONE Editorial policies and criteria.

Figures

Figure 1. Comparison of alignment tools.
Plot of Gamma log likelihoods of 100 trees created from paired reads selected from the Greengenes 135 database and aligned using PyNAST and Infernal version 1.1. Likelihoods of the trees created using Infernal are significantly better than those of the trees created with PyNAST (p<0.0001, Wilcoxon signed-rank test), strongly suggesting that Infernal produces better quality alignments than PyNAST for the same input reads.
Figure 2. Comparison of phylogenetic trees between libraries.
Plot of a Mantel correlation test comparing cophenetic distance matrices calculated from phylogenetic trees created using paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S trees. A higher correlation value means the trees are more similar to the full-length trees. Here, the paired trees are significantly closer to the full-length trees than the R1 and R2 trees (p<0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using both paired reads together leads to phylogenies closer to those obtained from full-length reads, even when the chosen primers create non-overlapping reads.
Figure 3. Comparison of β-diversity between libraries.
Plot of a Mantel correlation test comparing unweighted UniFrac distance matrices created using synthetic mock communities from paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S synthetic mock communities. A higher correlation value means the distance matrices, and hence their β-diversity, are more similar to those of the full-length communities. Here, the communities from paired reads are significantly closer to the full-length communities than the R1 and R2 communities (p<0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using both paired reads together leads to results closer to those obtained from full-length reads, even when the chosen primers create non-overlapping reads.
Figure 4. IM-TORNADO pipeline workflow.
Schematic of the IM-TORNADO pipeline workflow.
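
The Mantel correlation test used in Figures 2 and 3 compares two distance matrices over the same set of items. The following is a minimal sketch of that idea, not the authors' implementation: the captions do not state the correlation coefficient, the number of permutations, or the sidedness of the test, so the Pearson coefficient, the 999 permutations, the one-sided p-value, and all function names and toy matrices here are assumptions chosen for illustration.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (correlation, one-sided permutation p-value).

    Assumption: Pearson correlation and a one-sided test for positive
    association; the source does not specify either choice.
    """
    n = len(dist_a)
    rng = random.Random(seed)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))

    hits = 0
    order = list(range(n))
    for _ in range(permutations):
        # Permute rows and columns of dist_b together, which relabels the
        # items while keeping the matrix a valid distance matrix.
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= observed:
            hits += 1

    # Count the observed statistic itself so the p-value is never zero.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical toy data: distances among four items from a "full-length"
# analysis and a "paired-read" analysis that preserves the same structure.
full_length = [
    [0, 1, 4, 5],
    [1, 0, 3, 4],
    [4, 3, 0, 1],
    [5, 4, 1, 0],
]
paired = [[2 * value for value in row] for row in full_length]

r, p = mantel(full_length, paired)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Permuting rows and columns together, rather than shuffling individual matrix entries, is what distinguishes a Mantel test from an ordinary correlation test: the entries of a distance matrix are not independent, so the null distribution has to be built by relabeling the items.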
