mSystems. 2023 Aug 31;8(4):e0004023.
doi: 10.1128/msystems.00040-23. Epub 2023 Jul 25.

A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance

Amy R Sweeny et al. mSystems.

Abstract

Next-generation sequencing (NGS) and metabarcoding approaches are increasingly applied to wild animal populations, but there is a disconnect between the generalized linear mixed model (GLMM) approaches commonly used to study phenotypic variation and the statistical toolkit from community ecology typically applied to metabarcoding data. Here, we describe the suitability of a novel GLMM-based approach for analyzing the taxon-specific sequence read counts derived from standard metabarcoding data. This approach allows decomposition of the contribution of different drivers to variation in community composition (e.g., age, season, individual) via interaction terms in the model random-effects structure. We provide guidance on implementing this approach and show how these models can identify how responsible specific taxonomic groups are for the effects attributed to different drivers. We applied this approach to two cross-sectional data sets from the Soay sheep population of St. Kilda. GLMMs showed agreement with dissimilarity-based approaches, highlighting the substantial contribution of age and minimal contribution of season to microbiota community compositions, and simultaneously estimated the contribution of other technical and biological factors. We further used model predictions to show that age effects were principally due to increases in taxa of the phylum Bacteroidetes and declines in taxa of the phylum Firmicutes. This approach offers a powerful means for understanding the influence of drivers of community structure derived from metabarcoding data. We discuss how our approach could be readily adapted to allow researchers to estimate contributions of additional factors such as host or microbe phylogeny to answer emerging questions surrounding the ecological and evolutionary roles of within-host communities.

Importance: NGS and fecal metabarcoding methods have provided powerful opportunities to study the wild gut microbiome. A wealth of data is, therefore, amassing across wild systems, generating the need for analytical approaches that can appropriately investigate simultaneous factors at the host and environmental scale that determine the composition of these communities. Here, we describe a generalized linear mixed-effects model (GLMM) approach to analyze read count data from metabarcoding of the gut microbiota, allowing us to quantify the contributions of multiple host and environmental factors to within-host community structure. Our approach provides outputs that are familiar to a majority of field ecologists and can be run using any standard mixed-effects modeling packages. We illustrate this approach using two metabarcoding data sets from the Soay sheep population of St. Kilda investigating age and season effects as worked examples.
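
As a reading aid, one way to write down a model of the kind described above is sketched here. It assumes a Poisson response with a log link and uses the random-effect terms named in the Fig 1 caption (taxon, sample, taxon-by-host, taxon-by-season, and a row-level residual). The indices (i for ASV, h for host, s for season) and all symbol names are illustrative assumptions, not the authors' notation.

```latex
\begin{aligned}
y_{ihs} &\sim \mathrm{Poisson}(\lambda_{ihs}) \\
\log \lambda_{ihs} &= \beta_0 + \beta_{\mathrm{age}}\,\mathrm{age}_{hs}
  + a_{i} + b_{hs} + c_{ih} + d_{is} + e_{ihs} \\
a_{i} &\sim N(0, \sigma^2_{\mathrm{asv}}), \quad
b_{hs} \sim N(0, \sigma^2_{h:s}), \quad
c_{ih} \sim N(0, \sigma^2_{\mathrm{asv}:h}), \\
d_{is} &\sim N(0, \sigma^2_{\mathrm{asv}:s}), \quad
e_{ihs} \sim N(0, \sigma^2_{\mathrm{asv}:h:s})
\end{aligned}
```

Under this reading, the sample-level term b absorbs differences in read depth between samples, while the taxon-by-host and taxon-by-season terms capture how the relative abundance of individual ASVs shifts with host identity and season.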

Keywords: 16S; Bayesian estimation; amplicon sequence variants; community composition; differential abundance; generalized linear mixed-effects model; metabarcoding; microbiota.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Overview of mixed-model approach to wild microbiota analysis. Data processing (A) generates amplicon sequence variant (ASV)–level abundances for each sample. These raw abundances are used as the response for generalized linear mixed-effects models with Poisson error families. In the example illustrated, data include sampling time points for a group of individuals taken during two seasons. Model syntax therefore specifies a fixed effect of age, and random effects for taxonomy (asv), sample id (host:season, h:s), individual differential abundance of ASVs (asv:h), differential abundance of ASVs across seasons (asv:s), and a residual variance at the row level (asv:h:s). GLMM output can be used to partition the variance explained by each random-effect term (B). These variance components can be interpreted as the relative contributions of both technical variation and host or environmental contributions to differential abundance as illustrated in (C). Created with BioRender.com.
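
The data layout implied by this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `to_long`, the toy read counts, and the host, season, and age labels are all hypothetical. It reshapes a per-sample ASV count table into one row per ASV per sample and builds the grouping labels that the random-effect terms (asv, h:s, asv:h, asv:s, asv:h:s) would use.

```python
def to_long(counts, sample_info):
    """Reshape per-sample ASV counts into one row per (ASV, sample).

    counts:      {sample_id: {asv: reads}}   (missing ASVs count as 0)
    sample_info: {sample_id: (host, season, age)}
    """
    # Every ASV seen in any sample gets a row in every sample.
    asvs = sorted({asv for sample in counts.values() for asv in sample})
    rows = []
    for sample_id in sorted(counts):
        host, season, age = sample_info[sample_id]
        for asv in asvs:
            rows.append({
                "count": counts[sample_id].get(asv, 0),
                "age": age,                            # fixed effect
                "asv": asv,                            # taxonomy
                "h:s": f"{host}:{season}",             # sample id
                "asv:h": f"{asv}:{host}",              # ASV by individual
                "asv:s": f"{asv}:{season}",            # ASV by season
                "asv:h:s": f"{asv}:{host}:{season}",   # row-level residual
            })
    return rows


# Hypothetical toy data: two hosts, each sampled in two seasons.
counts = {
    "s1": {"ASV1": 120, "ASV2": 3},
    "s2": {"ASV1": 80, "ASV3": 15},
    "s3": {"ASV2": 40, "ASV3": 7},
    "s4": {"ASV1": 5, "ASV2": 60, "ASV3": 1},
}
sample_info = {
    "s1": ("h1", "April", "adult"),
    "s2": ("h1", "August", "adult"),
    "s3": ("h2", "April", "lamb"),
    "s4": ("h2", "August", "lamb"),
}
rows = to_long(counts, sample_info)
```

Each row's `asv:h:s` label is unique, which is why that term acts as a row-level residual, while `h:s` is shared by all ASVs in a sample and so can soak up differences in sequencing depth.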
Fig 2
Soay sheep gut microbiota beta diversity in adults and lambs from 2013 (A) and from April and August of 2016 (B). Principal coordinates analysis (PCoA) plots represent Bray–Curtis dissimilarity, indicating clustering of samples by group. Ellipsoids represent a 95% confidence interval surrounding each group.
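
For readers unfamiliar with the dissimilarity measure behind these plots, the standard Bray-Curtis formula (the sum of absolute differences divided by the sum of all counts) can be sketched as below. The two count vectors are made up for illustration, and any normalization of read counts before computing dissimilarity (for example, conversion to relative abundances) is left out because the caption does not specify it.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Returns 0.0 for identical samples and 1.0 for samples sharing no taxa.
    """
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


# Hypothetical counts for four ASVs in two samples.
sample_a = [10, 0, 5, 5]
sample_b = [0, 10, 5, 5]
d = bray_curtis(sample_a, sample_b)
```

A PCoA such as the one shown would then be computed from the full matrix of pairwise dissimilarities between samples.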
Fig 3
Proportion of variance in bacterial read counts from different ASVs explained by GLMM component terms for two data sets. The 2013 data set (A) compared gut microbiota across two age classes from individuals sampled once at the same time point, while the 2016 data set (B) compared samples taken from the same individuals over two seasons.
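
The proportions plotted in a figure like this can be obtained by dividing each estimated variance component by the total. The sketch below shows that arithmetic only; the component values are invented, the function name is hypothetical, and it assumes all components are on the same latent scale with no distribution-specific variance added to the denominator.

```python
def variance_proportions(components):
    """Share of total variance attributed to each random-effect term."""
    if any(v < 0 for v in components.values()):
        raise ValueError("variance components must be non-negative")
    total = sum(components.values())
    if total == 0:
        raise ValueError("total variance is zero")
    return {term: v / total for term, v in components.items()}


# Hypothetical variance estimates for the terms named in Fig 1.
components = {
    "asv": 4.0,
    "h:s": 0.5,
    "asv:h": 1.0,
    "asv:s": 0.5,
    "asv:h:s": 2.0,
}
props = variance_proportions(components)
```

With a Bayesian fit, the same calculation would typically be applied to each posterior draw so that every proportion carries its own interval rather than a single point value.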
Fig 4
Differential abundances across age classes (A and B) or seasons (C and D) for individual ASVs, calculated from GLMMs with Poisson error families and taxonomic levels specified as ASV only. Panels A and C show all ASV-level effects: violin plots represent the distribution of effect estimates, and point size represents the inverse variance of each estimate. Rectangles mark the ASVs with the largest-magnitude (positive or negative) differential abundances, which are shown in the forest plots (B and D). Forest plots show point estimates and highest posterior density intervals (HPDI) for the ASVs with the 50 (age class) or 10 (season) strongest increases and decreases in abundance.
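
The HPDI (highest posterior density interval) reported in the forest plots is the narrowest interval that contains a given share of the posterior draws. A common sample-based way to compute it is sketched below: sort the draws and slide a fixed-size window across them. The function name and the toy draws are hypothetical, and this is a generic method rather than necessarily the routine the authors used.

```python
import math


def hpdi(samples, prob=0.95):
    """Narrowest interval containing at least `prob` of the draws."""
    if not 0 < prob < 1:
        raise ValueError("prob must be between 0 and 1")
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("samples must not be empty")
    # Number of draws the interval has to cover.
    k = max(1, math.ceil(prob * n))
    # Slide a window of k consecutive sorted draws; keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical posterior draws for one ASV-level effect.
draws = [12.5, 0, 11.5, 3, 30, 1, 10, 2, 12, 11]
interval = hpdi(draws, prob=0.5)
```

Unlike an equal-tailed interval, the HPDI follows the densest region of the draws, so for skewed posteriors it can sit off-center relative to the median.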
