Exploring microbial functional biodiversity at the protein family level: From metagenomic sequence reads to annotated protein clusters
- PMID: 36959975
- PMCID: PMC10029925
- DOI: 10.3389/fbinf.2023.1157956
Abstract
Metagenomics has opened access to the genetic repertoire of natural microbial communities, and metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. Several methods have been developed to process and analyze the sequence data, from raw reads to end products such as predicted protein sequences or protein families. In this article, we review these processes and discuss the alternative methodologies that can be followed to explore biodiversity at the protein family level. We describe the analysis tools in detail and comment on their scalability, advantages, and disadvantages. Finally, we report the available data repositories and recommend approaches for protein family annotation related to phylogenetic distribution, structure prediction, and metadata enrichment.
Keywords: biodiversity; cluster annotation; metagenomes; metatranscriptomes; microbial dark matter; protein clustering; protein families.
Copyright © 2023 Baltoumas, Karatzas, Paez-Espino, Venetsianou, Aplakidou, Oulas, Finn, Ovchinnikov, Pafilis, Kyrpides and Pavlopoulos.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.