PLoS One. 2016 Jan 12;11(1):e0146423.
doi: 10.1371/journal.pone.0146423. eCollection 2016.

Functional Profiling of Unfamiliar Microbial Communities Using a Validated De Novo Assembly Metatranscriptome Pipeline

Mark Davids et al. PLoS One.

Abstract

Background: Metatranscriptomic landscapes can provide insights into functional relationships within natural microbial communities. Analysis of complex metatranscriptome datasets from these communities poses a considerable bioinformatic challenge, since the communities are non-restricted and contain a varying number of participating strains and species. For RNA-Seq data, a standard approach is to align the generated reads to a set of closely related reference genomes. This only works well for microbial communities for which a near-complete catalogue of reference genomes is available at a small evolutionary distance. In this study, we focus on the design of a validated de novo metatranscriptome assembly pipeline for single-end Illumina RNA-Seq data to obtain functional and taxonomic profiles of murine microbial communities.

Results: The de novo assembly metatranscriptome pipeline developed here combines rRNA removal, assembly with IDBA-UD, functional annotation and taxonomic classification. Different assemblers were tested and validated using RNA-Seq data from an in silico generated mock community and in vivo RNA-Seq data from a restricted microbial community taken from a mouse model colonized with Altered Schaedler Flora (ASF). Precision and recall of the resulting gene expression, functional and taxonomic profiles were compared to those obtained with a standard alignment method. The validated pipeline was subsequently used to generate expression profiles from non-restricted cecal communities of four C57BL/6J mice fed a high-fat high-protein diet, spiked with an RNA-Seq data set from a well-characterized human sample. The spike-in control was used to estimate precision and recall at the assembly, functional and taxonomic levels for non-restricted communities.

Conclusions: A generic de novo assembly pipeline for metatranscriptome data analysis was designed for microbial ecosystems; it can be applied to microbial metatranscriptome analysis in any chosen niche.
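
The precision and recall comparisons mentioned in the Results can be illustrated with a small set-based sketch. The abstract does not state the exact definitions used in the study, so the following assumes the common formulation in which the features recovered by de novo assembly (for example KEGG orthologs) are compared against those recovered by the reference alignment method. The function name and the example identifiers are hypothetical.

```python
def precision_recall(predicted, reference):
    """Set-based precision and recall (assumed definition, not taken from the paper).

    predicted: features reported by the method under test (e.g. de novo assembly)
    reference: features reported by the trusted method (e.g. direct genome alignment)
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical KEGG ortholog identifiers, for illustration only.
assembly_hits = {"K00001", "K00002", "K00003"}
alignment_hits = {"K00001", "K00002", "K00004", "K00005"}
precision, recall = precision_recall(assembly_hits, alignment_hits)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three assembly-derived features are also found by alignment, and two of the four alignment-derived features are recovered by assembly.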

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Metatranscriptome analysis workflow.
Details of the programs used are described in the methods section.
Fig 2. Comparison of functional profiles of an eight species mock community metatranscriptome.
Reads assigned via direct genome alignment method (x-axis) and de novo assembly with IDBA-UD (y-axis). Each dot represents a specific KEGG orthologous function.
Fig 3. Comparison of functional and taxonomic profiles of the Altered Schaedler Flora from the intestine of a NOD mouse model [13].
A) Alignment vs assembly functional profiling; x-axis, direct genome alignment; y-axis, de novo assembly. Taxonomic profiles of mRNA reads obtained by direct genome mapping (B) and by using the de novo assembly method (C). Sample labels were taken from Xiong et al.
Fig 4. Similarity score distributions of predicted Mouse and Human microbial community proteins to known proteins.
Translated proteins were aligned to the NCBI nr protein database and binned according to their SRV score. The SRV score represents the bit-score of the best hit divided by the maximum obtainable bit-score [61].
Fig 5. Taxonomic composition of the transcriptome using three different methods.
Reads for three samples were assigned to family level using de novo assembly, blastx and megablast.
Fig 6. Distribution of COG functional categories of the mouse cecal metatranscriptome.
Fig 7. Metabolic pathways mapping of Lachnospiraceae and Erysipelotrichaceae expression profiles.
Relative contributions of each family (green, Lachnospiraceae; red, Erysipelotrichaceae) are color-scaled. Line width indicates the total number of reads mapped to the corresponding KEGG ortholog (log-scaled).
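
The SRV score described in the Fig 4 caption is a simple ratio, and the binning step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of ten equal-width bins, and the example bit-scores are assumptions, and the maximum obtainable bit-score is passed in directly rather than computed.

```python
def srv_score(best_hit_bitscore, max_bitscore):
    """SRV = bit-score of the best hit / maximum obtainable bit-score."""
    if max_bitscore <= 0:
        raise ValueError("max_bitscore must be positive")
    return best_hit_bitscore / max_bitscore


def bin_scores(scores, n_bins=10):
    """Count scores in n_bins equal-width bins over [0, 1].

    A score of exactly 1.0 is placed in the last bin.
    """
    counts = [0] * n_bins
    for score in scores:
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical (best hit, maximum) bit-score pairs for three predicted proteins.
pairs = [(150.0, 200.0), (10.0, 200.0), (320.0, 320.0)]
scores = [srv_score(best, maximum) for best, maximum in pairs]
print(bin_scores(scores))
```

A score near 1 indicates a protein almost identical to a known database entry, while a low score indicates a distant best hit.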

References

    1. Li J, Jia H, Cai X, Zhong H, Feng Q, Sunagawa S, et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol. 2014;32: 834–841. doi: 10.1038/nbt.2942
    2. Lozupone C, Faust K, Raes J, Faith JJ, Frank DN, Zaneveld J, et al. Identifying genomic and metabolic features that can underlie early successional and opportunistic lifestyles of human gut symbionts. Genome Res. 2012;22: 1974–1984. doi: 10.1101/gr.138198.112
    3. Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464: 59–65. doi: 10.1038/nature08821
    4. Wu GD, Chen J, Hoffmann C, Bittinger K, Chen Y-Y, Keilbaugh SA, et al. Linking Long-Term Dietary Patterns with Gut Microbial Enterotypes. Science. 2011;334: 105–108. doi: 10.1126/science.1208344
    5. Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, et al. Enterotypes of the human gut microbiome. Nature. 2011;473: 174–180. doi: 10.1038/nature09944