PLoS One. 2014 Dec 15;9(12):e114804.

doi: 10.1371/journal.pone.0114804. eCollection 2014.

IM-TORNADO: a tool for comparison of 16S reads from paired-end libraries

Patricio Jeraldo¹, Krishna Kalari², Xianfeng Chen², Jaysheel Bhavsar², Ashutosh Mangalam³, Bryan White⁴, Heidi Nelson⁵, Jean-Pierre Kocher², Nicholas Chia⁶

Affiliations

¹ Department of Surgery, Mayo Clinic, Rochester, Minnesota, United States of America; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America.
² Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America.
³ Department of Immunology, Mayo Clinic, Rochester, Minnesota, United States of America.
⁴ Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America; Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America.
⁵ Department of Surgery, Mayo Clinic, Rochester, Minnesota, United States of America.
⁶ Department of Surgery, Mayo Clinic, Rochester, Minnesota, United States of America; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America; Department of Physiology and Biomedical Engineering, Mayo Clinic College of Medicine, Rochester, Minnesota, United States of America.

PMID: 25506826
PMCID: PMC4266640
DOI: 10.1371/journal.pone.0114804

Patricio Jeraldo et al. PLoS One. 2014.

Abstract

Motivation: 16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze the data from each read separately and underutilize the information contained in the paired-end reads.

Results: We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that using both reads produced results with greater correlation to those from full-length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity.
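To make "non-overlapping reads" concrete, the sketch below shows one simple way to keep both reads of a pair in a single sequence when they cannot be merged by overlap: join R1 to the reverse complement of R2 with a short pad between them. This is an illustrative assumption, not a description of IM-TORNADO's actual procedure; the function names `reverse_complement` and `join_pair`, the `N` padding, and the orientation choice are all hypothetical.

```python
# Illustrative sketch only: names, padding, and orientation are assumptions,
# not taken from the IM-TORNADO source.

# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def join_pair(r1: str, r2: str, pad: str = "N" * 10) -> str:
    """Join a non-overlapping read pair into one sequence.

    R2 is reverse-complemented so both halves are on the same strand,
    and a pad marks the unsequenced gap between them.
    """
    return r1 + pad + reverse_complement(r2)


if __name__ == "__main__":
    # Toy reads; real reads would come from paired FASTQ files.
    print(join_pair("ACGT", "AAGC", pad="NN"))
```

Keeping the pair in one record means downstream steps such as alignment and tree building see both hypervariable regions at once, which is the kind of information the abstract says is lost when reads are analyzed separately.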

Availability and implementation: IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.

Conflict of interest statement

Competing Interests: The authors note that Prof. Bryan White is a member of the Editorial Board of PLOS ONE. This does not alter the authors' adherence to PLOS ONE Editorial policies and criteria.

Figures

**Figure 1. Comparison of alignment tools.**
Plot of Gamma log likelihoods of 100 trees created from paired reads selected from the Greengenes 13_5 database and aligned using PyNAST and Infernal version 1.1. Likelihoods of the trees created using Infernal are significantly better than those of the trees created with PyNAST (p<0.0001, Wilcoxon signed-rank test), strongly suggesting that Infernal produces better quality alignments than PyNAST for the same input reads.

**Figure 2. Comparison of phylogenetic trees between libraries.**
Plot of a Mantel correlation test comparing cophenetic distance matrices calculated from phylogenetic trees created using paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S trees. A higher correlation value means the trees are more closely related to the full-length trees. Here, the paired trees are significantly closer to the full-length trees than the R1 and R2 trees (p<0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using both reads of a pair together leads to phylogenies closer to what is obtained from full-length reads, even when the chosen primers create non-overlapping reads.

**Figure 3. Comparison of β-diversity between libraries.**
Plot of a Mantel correlation test comparing unweighted UniFrac distance matrices created using synthetic mock communities from paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S synthetic mock communities. A higher correlation value means the distance matrices, and hence their β-diversity, are more closely related to the full-length communities. Here, the communities from paired reads are significantly closer to the full-length communities than the R1 and R2 communities (p<0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using both reads of a pair together leads to results closer to what is obtained from full-length reads, even when the chosen primers create non-overlapping reads.

**Figure 4. IM-TORNADO pipeline workflow.**
Schematic of the IM-TORNADO pipeline workflow.
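Figures 2 and 3 both rely on a Mantel correlation between two distance matrices. The following is a minimal, standard-library sketch of a one-sided Mantel test with Pearson correlation, included only to show what the statistic measures. The function names, the permutation count, and the toy matrices are assumptions for illustration; the captions do not say which implementation or settings the authors used.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def permute(matrix, order):
    """Reorder rows and columns of a matrix with the same permutation."""
    return [[matrix[i][j] for j in order] for i in order]


def mantel(d1, d2, permutations=999, seed=0):
    """Return (correlation, one-sided permutation p-value) for d1 vs d2."""
    rng = random.Random(seed)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    n = len(d1)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Shuffling sample labels of d2 breaks any real association with d1.
        if pearson(x, upper_triangle(permute(d2, order))) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy symmetric distance matrices standing in for "full-length" and "paired".
    d_full = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
    d_paired = [[2 * v for v in row] for row in d_full]
    r, p = mantel(d_full, d_paired)
    print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

In the figures, one such correlation would be computed per synthetic mock community for each read type (paired, R1, R2) against the full-length matrix; the resulting sets of correlations are then what the Wilcoxon signed-rank test compares.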

Grants and funding

R01 CA179243/CA/NCI NIH HHS/United States
