Gigascience. 2019 Aug 1;8(8):giz096.
doi: 10.1093/gigascience/giz096.

To assemble or not to resemble-A validated Comparative Metatranscriptomics Workflow (CoMW)

Muhammad Zohaib Anwar et al. Gigascience.

Abstract

Background: Metatranscriptomics has been used widely for investigation and quantification of microbial communities' activity in response to external stimuli. By assessing the genes expressed, metatranscriptomics provides an understanding of the interactions between different major functional guilds and the environment. Here, we present a de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW) implemented in a modular, reproducible structure. Metatranscriptomics typically uses short sequence reads, which can either be directly aligned to external reference databases ("assembly-free approach") or first assembled into contigs before alignment ("assembly-based approach"). We also compare CoMW (assembly-based implementation) with an assembly-free alternative workflow, using simulated and real-world metatranscriptomes from Arctic and temperate terrestrial environments. We evaluate the accuracy of both workflows in terms of precision and recall, using generic and specialized hierarchical protein databases.
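The precision and recall measures named above can be illustrated with a minimal sketch. It assumes that each workflow yields a set of identified gene identifiers and that the simulated data supply a known ground-truth set. The function name `precision_recall` and the gene labels are hypothetical and are not taken from CoMW, and the paper's exact definition of "false-positive results" may differ from the fraction computed here.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for predicted vs. ground-truth identifiers."""
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)  # identifiers found in both sets
    # Guard against empty sets to avoid division by zero.
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
truth = {"geneA", "geneB", "geneC", "geneD"}
predicted = {"geneA", "geneB", "geneX"}

precision, recall = precision_recall(predicted, truth)
# Share of reported genes that are not in the ground truth (an assumed
# reading of "false-positive results", equal to 1 - precision).
false_positive_fraction = 1 - precision
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, a workflow that reports fewer spurious genes has higher precision, while recall measures how many of the truly expressed genes it recovers.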

Results: CoMW provided significantly fewer false-positive results, resulting in more precise identification and quantification of functional genes in metatranscriptomes. Using the comprehensive database M5nr, the assembly-based approach identified genes with only 0.6% false-positive results at thresholds ranging from inclusive to stringent compared with the assembly-free approach, which yielded up to 15% false-positive results. Using specialized databases (carbohydrate-active enzyme and nitrogen cycle), the assembly-based approach identified and quantified genes with 3-5 times fewer false-positive results. We also evaluated the impact of both approaches on real-world datasets.

Conclusions: We present an open source de novo assembly-based CoMW. Our benchmarking findings support assembling short reads into contigs before alignment to a reference database because this provides higher precision and minimizes false-positive results.

Keywords: alignment; assembly; benchmarking; false-positive results; metatranscriptomics; precision; recall.

Figures

Figure 1:
Flow chart illustrating the evaluation and benchmarking scheme used for the comparison of alternative approaches. Red path indicates the full-length genes workflow, green indicates the steps in the assembly-based workflow CoMW, and blue indicates the steps in the assembly-free approach.
Figure 2:
Differential expression comparison of the assembly-free and the CoMW assembly-based approaches using (A) eggNOG database, (B) CAZy, and (C) NCycDB database.
Figure 3:
Relative abundance of eggNOG functional subsystems in Arctic permafrost soil, identified and quantified using both CoMW and the assembly-free approach, showing the differences in observed functional dynamics. Blue dotted line represents trends using CoMW (assembly-based) whereas red solid line represents the assembly-free approach.
Figure 4:
Relative abundance of eggNOG functional subsystems over time in ash-deposited Danish forest soil, identified using both CoMW and the assembly-free approach. Blue dotted line represents trends using CoMW (assembly-based) whereas red solid line represents the assembly-free approach.
