Quantifying Shared and Unique Gene Content across 17 Microbial Ecosystems
- PMID: 37022232
- PMCID: PMC10134805
- DOI: 10.1128/msystems.00118-23
Abstract
Measuring microbial diversity is traditionally based on microbe taxonomy. Here, in contrast, we aimed to quantify heterogeneity in microbial gene content across 14,183 metagenomic samples spanning 17 ecologies, including 6 human associated, 7 nonhuman host associated, and 4 in other nonhuman host environments. In total, we identified 117,629,181 nonredundant genes. The vast majority of genes (66%) occurred in only one sample (i.e., "singletons"). In contrast, we found 1,864 sequences present in every metagenome, but not necessarily every bacterial genome. Additionally, we report data sets of other ecology-associated genes (e.g., abundant in only gut ecosystems) and simultaneously demonstrated that prior microbiome gene catalogs are both incomplete and inaccurately cluster microbial genetic life (e.g., at gene sequence identities that are too restrictive). We provide our results and the sets of environmentally differentiating genes described above at http://www.microbial-genes.bio.

IMPORTANCE The amount of shared genetic elements has not been quantified between the human microbiome and other host- and non-host-associated microbiomes. Here, we made a gene catalog of 17 different microbial ecosystems and compared them. We show that most species shared between environment and human gut microbiomes are pathogens and that prior gene catalogs described as "nearly complete" are far from it. Additionally, over two-thirds of all genes only appear in a single sample, and only 1,864 genes (0.001%) are found in all types of metagenomes. These results highlight the large diversity between metagenomes and reveal a new, rare class of genes, those found in every type of metagenome, but not every microbial genome.
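The two prevalence categories named in the abstract, singletons and genes found in every type of metagenome, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input mappings, and the toy data are hypothetical, and the sketch assumes that gene clustering has already produced a mapping from each nonredundant gene to the samples containing it.

```python
def summarize_gene_prevalence(gene_to_samples, sample_to_ecology):
    """Return (singleton_fraction, genes_found_in_every_ecology).

    gene_to_samples: dict mapping a gene ID to the set of sample IDs containing it
    sample_to_ecology: dict mapping a sample ID to its ecology label
    Both inputs are hypothetical structures chosen for illustration.
    """
    all_ecologies = set(sample_to_ecology.values())
    singleton_count = 0
    ubiquitous = []
    for gene, samples in gene_to_samples.items():
        samples = set(samples)
        # A singleton is a gene observed in exactly one sample.
        if len(samples) == 1:
            singleton_count += 1
        # A gene counts as ubiquitous here if it appears in at least
        # one sample from every ecology (an assumed reading of
        # "found in all types of metagenomes").
        ecologies_seen = {sample_to_ecology[s] for s in samples}
        if ecologies_seen == all_ecologies:
            ubiquitous.append(gene)
    total = len(gene_to_samples)
    fraction = singleton_count / total if total else 0.0
    return fraction, sorted(ubiquitous)


# Toy data (invented for illustration only): four samples in three ecologies.
toy_sample_to_ecology = {"s1": "gut", "s2": "gut", "s3": "soil", "s4": "marine"}
toy_gene_to_samples = {
    "geneA": {"s1"},
    "geneB": {"s1", "s3", "s4"},
    "geneC": {"s2", "s3"},
    "geneD": {"s4"},
}

fraction, ubiquitous = summarize_gene_prevalence(
    toy_gene_to_samples, toy_sample_to_ecology
)
print(fraction, ubiquitous)
```

In the toy data, two of the four genes occur in a single sample and only geneB is seen in all three ecologies. At the scale reported in the abstract, a sparse presence/absence matrix would likely be a more practical representation than Python sets.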
Keywords: bioinformatics; human microbiome; metagenomics.
Conflict of interest statement
The authors declare a conflict of interest. Aleksandar D. Kostic is an advisor at FitBiomics. Chirag J. Patel is a cofounder of XY.ai. Braden T. Tierney consults for Seed Health on microbiome study design and analysis.