eLife. 2023 Feb 9;12:e78530.
doi: 10.7554/eLife.78530.

Community diversity is associated with intra-species genetic diversity and gene loss in the human gut microbiome


Naïma Madi et al. eLife.

Abstract

How the ecological process of community assembly interacts with intra-species diversity and evolutionary change is a longstanding question. Two contrasting hypotheses have been proposed: Diversity Begets Diversity (DBD), in which taxa tend to become more diverse in already diverse communities, and Ecological Controls (EC), in which higher community diversity impedes diversification. Previously, using 16S rRNA gene amplicon data across a range of microbiomes, we showed a generally positive relationship between taxa diversity and community diversity at higher taxonomic levels, consistent with the predictions of DBD (Madi et al., 2020). However, this positive 'diversity slope' plateaus at high levels of community diversity. Here we show that this general pattern holds at much finer genetic resolution, by analyzing intra-species strain and nucleotide variation in static and temporally sampled metagenomes from the human gut microbiome. Consistent with DBD, both intra-species polymorphism and strain number were positively correlated with community Shannon diversity. Shannon diversity is also predictive of increases in polymorphism over time scales up to ~4–6 months, after which the diversity slope flattens and becomes negative, consistent with DBD eventually giving way to EC. Finally, we show that higher community diversity predicts gene loss at a future time point. This observation is broadly consistent with the Black Queen Hypothesis, which posits that genes with functions provided by the community are less likely to be retained in a focal species' genome. Together, our results show that a mixture of DBD, EC, and Black Queen may operate simultaneously in the human gut microbiome, adding to a growing body of evidence that these eco-evolutionary processes are key drivers of biodiversity and ecosystem function.
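
Community Shannon diversity is the main predictor throughout the abstract. As a minimal sketch of how that index is computed from per-species abundances, the following uses the standard formula H = -Σ p_i ln p_i. The natural-log base, the function name `shannon_diversity`, and the toy abundance tables are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections.abc import Mapping


def shannon_diversity(abundances: Mapping[str, float]) -> float:
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero abundance."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    h = 0.0
    for count in abundances.values():
        if count > 0:  # taxa with zero abundance contribute nothing
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical communities: same richness (4 species), different evenness.
even = {"sp_a": 25, "sp_b": 25, "sp_c": 25, "sp_d": 25}
skewed = {"sp_a": 97, "sp_b": 1, "sp_c": 1, "sp_d": 1}

print(shannon_diversity(even))    # maximal for 4 species: ln(4)
print(shannon_diversity(skewed))  # lower, because one species dominates
```

Unlike species richness, the index responds to evenness as well as to the number of species, which is one reason the two predictors can behave differently in the models.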

Keywords: Black Queen; ecology; evolution; evolutionary biology; metagenomics; microbiome; population genetics.

Conflict of interest statement

NM, DC, RW, BS, NG: No competing interests declared.

Figures

Figure 1. Diversity begets diversity (DBD) and ecological controls (EC) hypotheses illustrated.
Hypothetical microbial communities are illustrated as gray circles containing assemblages of microbial species, shown in different colors. 'DBD' means that the focal species is more likely to acquire diversity – through de novo mutation, invasion of a different strain of the same species, or a combination of both – in a community with high diversity. This is because new niches are created in a more diverse community. By contrast, 'EC' means that the focal species is more likely to acquire diversity through strain invasion or mutation in a community with low diversity. This is because niches remain unfilled in a low-diversity community, while niche space is saturated in a high-diversity community, impeding further diversification.
Figure 2. Positive association between community diversity and within-species polymorphism in cross-sectional Human Microbiome Project (HMP) samples.
(A) Scatter plots showing the relationship between community Shannon diversity and within-species polymorphism rate (estimated at synonymous sites) in the nine most prevalent species in HMP. (B) Scatter plots showing the relationship between species richness and within-species polymorphism rate in the nine most prevalent species in HMP. These are simple correlations to show the relationships in the raw data. Significant correlations are shown with red trendlines (Spearman correlation, p<0.05); non-significant trendlines are in gray. Results of generalized additive models (GAMs) predicting polymorphism rate in a focal species as a function of (C) Shannon diversity, (D) species richness estimated on all sequence data, and (E) species richness estimated on rarefied sequence data. GAMs are based on data from 69 bacterial species across 249 HMP stool donors. Adjusted R² and Chi-square p-values corresponding to the predictor effect are displayed in each panel. Shaded areas show the 95% confidence interval of each model prediction. See Supplementary file 1a and Supplementary file 2 section 1 for detailed model outputs.
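
Panel (E) of Figure 2 uses richness estimated on rarefied sequence data. The caption does not describe the rarefaction procedure, so the following is only a generic sketch of the idea: subsample every sample to the same read depth without replacement, then count the species that remain. The function name `rarefied_richness`, the read counts, and the depth are hypothetical.

```python
import random
from collections.abc import Mapping


def rarefied_richness(counts: Mapping[str, int], depth: int, seed: int = 0) -> int:
    """Number of distinct species left after subsampling `depth` reads
    without replacement from a sample's per-species read counts."""
    # Expand counts into one entry per read so each read is equally likely.
    reads = [species for species, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    subsample = rng.sample(reads, depth)
    return len(set(subsample))


# Hypothetical sample: two abundant species and one rare species.
sample_counts = {"sp_a": 50, "sp_b": 30, "sp_c": 1}

print(rarefied_richness(sample_counts, depth=10))  # rare species may drop out
print(rarefied_richness(sample_counts, depth=81))  # all reads kept, so richness is 3
```

Rarefying to a common depth removes the tendency of deeply sequenced samples to show more species simply because more reads were drawn.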
Figure 2—figure supplement 1. Results of generalized additive models predicting within-species polymorphism rate (at synonymous sites) as a function of community diversity at higher taxonomic levels (Human Microbiome Project [HMP] data).
(A1–E1) The predictor is Shannon diversity. (A2–E2) The predictor is richness. Adjusted R-squared (R²) and Chi-squared p-values corresponding to the predictor are displayed in each panel (summary.gam function from the mgcv R package). Shaded areas show the 95% confidence interval of each model prediction. See Supplementary file 1c and Supplementary file 2 sections 2 and 3 for further details about model outputs.
Figure 2—figure supplement 2. Results of generalized additive models predicting within-species polymorphism rate (at nonsynonymous sites) in a focal species as a function of community diversity at higher taxonomic levels (Human Microbiome Project [HMP] data).
(A1–E1) The predictor is Shannon diversity. (A2–E2) The predictor is richness. Adjusted R-squared (R²) and Chi-squared p-values corresponding to the predictor are displayed in each panel (summary.gam function from the mgcv R package). Shaded areas show the 95% confidence interval of each model prediction. See Supplementary file 1d and Supplementary file 2 sections 5 and 6 for further details about model outputs.
Figure 3. Associations between community diversity and strain number in cross-sectional Human Microbiome Project (HMP) samples.
(A) Scatter plots showing the relationship between Shannon diversity and the inferred number of strains within each of the nine most prevalent species in HMP. (B) Scatter plots showing the relationship between species richness and the inferred number of strains within each of the nine most prevalent species in HMP. Significant linear correlations are shown with red trendlines (Pearson correlation, p<0.05); non-significant trend lines are in gray. Results of generalized linear mixed models (GLMMs) predicting strain count in a focal species as a function of (C) Shannon diversity, (D) species richness estimated on all data, and (E) species richness estimated on rarefied sequence data. Diversity estimates (X-axis) are standardized to zero mean and unit variance in the models. The Y-axis shows the mean number of strains per focal species predicted by the GLMM. GLMMs are based on data from 184 bacterial species across 249 HMP stool donors. p-Values (likelihood ratio test) are displayed in each panel. Shaded areas show the 95% confidence interval of each model prediction. See Supplementary file 1e and Supplementary file 2 section 7 for detailed model outputs.
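
The Figure 3 caption notes that diversity estimates are standardized to zero mean and unit variance before entering the models. A minimal sketch of that transformation follows. Whether the sample (n-1) or population standard deviation was used is not stated, so the sample version here is an assumption, and the input values are hypothetical.

```python
import statistics
from collections.abc import Sequence


def standardize(values: Sequence[float]) -> list[float]:
    """Rescale values to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator): an assumption
    if sd == 0:
        raise ValueError("cannot standardize a constant predictor")
    return [(v - mean) / sd for v in values]


# Hypothetical Shannon diversity values for four samples.
shannon_values = [2.1, 2.8, 3.4, 3.9]
z_scores = standardize(shannon_values)
print(z_scores)
```

After standardization, a model coefficient describes the change in the response per one standard deviation of the predictor, which puts Shannon diversity and richness on a comparable scale.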
Figure 3—figure supplement 1. Results of generalized linear mixed models predicting strain count in a focal species as a function of community diversity at higher taxonomic levels (Human Microbiome Project [HMP] data).
Strain number in a focal species is positively correlated with Shannon diversity (A1–E1), whereas its correlation with richness remains negative (A2–E2) at all taxonomic levels. The Y-axis is the predicted mean number of strains within a focal species. p-Values are from likelihood ratio tests (LRT; drop1 function from the R stats package). Shaded areas show the 95% confidence interval of each model prediction. See Supplementary file 1f and Supplementary file 2 section 9 for model details.
Figure 4. Positive association between community diversity and gene loss in Human Microbiome Project (HMP) time series.
(A) Scatter plots showing the relationship between Shannon diversity at time point 1 (tp1) and gene loss between tp1 and tp2 within each of the nine most prevalent species in HMP. (B) Scatter plots showing the relationship between species richness at tp1 and gene loss between tp1 and tp2 within each of the nine most prevalent species in HMP. Significant linear correlations are shown with red trendlines (Pearson correlation, p<0.05); non-significant trend lines are in gray. The Y-axis is plotted on a log10 scale for clarity. Results of generalized linear mixed models (GLMMs) predicting gene loss in a focal species as a function of (C) Shannon diversity, (D) species richness estimated on all data, and (E) species richness estimated on rarefied sequence data. p-Values (likelihood ratio test) are displayed in each panel. Shaded areas show the 95% confidence interval of each model prediction. The Y-axis is plotted on the link scale, which corresponds to log for negative binomial GLMMs with a count response. GLMMs are based on data from 54 bacterial species across 154 HMP stool donors sampled at more than one time point. See Supplementary file 1g and Supplementary file 2 section 10 for detailed model outputs.
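
Because panels (C) to (E) of Figure 4 are drawn on the log link scale, reading them as gene counts requires exponentiating. The sketch below shows that back-transformation for a prediction with its confidence bounds, and for a slope, which becomes a multiplicative rate ratio. All numeric values and function names are hypothetical and are not taken from the paper's model outputs.

```python
import math


def link_to_counts(eta: float, lower: float, upper: float) -> tuple[float, float, float]:
    """Convert a log-link prediction and its confidence bounds to expected counts."""
    return math.exp(eta), math.exp(lower), math.exp(upper)


def rate_ratio(slope: float) -> float:
    """Multiplicative change in the expected count per unit increase of the predictor."""
    return math.exp(slope)


# Hypothetical link-scale prediction with a 95% interval.
mean_count, low_count, high_count = link_to_counts(1.2, 0.9, 1.5)
print(mean_count, low_count, high_count)

# Hypothetical slope of 0.25 on the log scale.
print(rate_ratio(0.25))
```

On the link scale the fitted relationship is a straight line, whereas on the count scale the same model implies a constant proportional change per unit of diversity.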
Figure 5. Community diversity is associated with increases in focal species polymorphism over short time lags and net gene loss in dense gut microbiome time series.
(A) Results of a generalized additive model (GAM) predicting polymorphism change in a focal species as a function of the interaction between Shannon diversity at the first time point and the time lag (days) between two time points in data from Poyet et al. The response (Y-axis) was log-transformed in the Gaussian GAM. Results of generalized linear mixed models (GLMMs) predicting (B) number of genes lost and (C) number of genes gained between two time points in a focal species as a function of the interaction between Shannon diversity at the first time point and the time lag between the two time points. (D) Results of the GLMM predicting the number of genes gained in a focal species as a function of the interaction between rarefied species richness at the first time point and the time lag between the two time points. The illustrated time lags correspond to the first quartile (50 days), the median (130 days), and the third quartile (250 days). See Supplementary file 1h and i and Supplementary file 2 section 11 for detailed model outputs. These analyses are based on data from 15 bacterial species across four stool donors from Poyet et al. Only statistically significant relationships are plotted. Non-significant relationships are not shown: the GAM predicting polymorphism change as a function of rarefied richness (p>0.05) and the GLMM predicting the number of genes lost as a function of rarefied richness (p>0.05).
Figure 5—figure supplement 1. Results of a generalized additive model (GAM) predicting polymorphism change in a focal species as a function of the interaction between Shannon diversity at the first time point and the time lag (days) between two time points in the Poyet time series.
The response (Y-axis) was log-transformed in the Gaussian GAM. Several different time lags are shown to illustrate the inversion of the relationship around a lag time of 150 days. See Supplementary file 1h and Supplementary file 2 section 11 for further model details.
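
The inversion of the diversity slope with increasing time lag can be pictured with a much simpler model than the GAM used in the paper. The sketch below assumes a linear interaction, in which the effect of diversity at a given lag is `b_div + b_interaction * lag_days`, so the sign flips at `-b_div / b_interaction`. The coefficients are hypothetical and were chosen only so that the crossover falls near the roughly 150-day lag mentioned in the caption; they are not estimates from the paper.

```python
def diversity_slope(b_div: float, b_interaction: float, lag_days: float) -> float:
    """Effect of diversity on the response at a given lag, under a linear interaction."""
    return b_div + b_interaction * lag_days


def crossover_lag(b_div: float, b_interaction: float) -> float:
    """Lag at which the diversity effect changes sign."""
    if b_interaction == 0:
        raise ValueError("no interaction term, so the slope never changes sign")
    return -b_div / b_interaction


# Hypothetical coefficients, for illustration only.
B_DIV = 0.3
B_INTERACTION = -0.002

# Time lags matching the quartiles illustrated in Figure 5.
for lag in (50, 130, 250):
    print(lag, diversity_slope(B_DIV, B_INTERACTION, lag))

print(crossover_lag(B_DIV, B_INTERACTION))
```

With these illustrative values the slope is positive at short lags and negative at long ones, the qualitative pattern the caption describes as DBD giving way to EC.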

Update of

  • doi: 10.1101/2022.03.08.483496
