[Preprint]. 2023 Oct 2:2023.09.30.560294.
doi: 10.1101/2023.09.30.560294.

Gut Microbiome Wellness Index 2 for Enhanced Health Status Prediction from Gut Microbiome Taxonomic Profiles


Daniel Chang et al. bioRxiv.

Abstract

Recent advances in human gut microbiome research have highlighted its potential for predictive healthcare applications. We introduce Gut Microbiome Wellness Index 2 (GMWI2), an advanced iteration of our original GMWI prototype, designed as a robust, disease-agnostic health status indicator based on gut microbiome taxonomic profiles. Our analysis pooled 8,069 existing stool shotgun metagenomes from a global range of populations to capture biological signals linking gut taxonomic composition to health. GMWI2 achieves a cross-validation balanced accuracy of 80% in distinguishing healthy (no disease) from non-healthy (diseased) individuals and surpasses 90% accuracy for samples with higher confidence (i.e., outside the "reject option"). GMWI2 outperforms both the original GMWI model and traditional species-level α-diversity indices in classification accuracy, suggesting a more reliable tool for differentiating between healthy and non-healthy phenotypes using gut microbiome data. Furthermore, by reevaluating and reinterpreting previously published data, GMWI2 provides fresh insights into the established understanding of how diet, antibiotic exposure, and fecal microbiota transplantation influence gut health. Looking ahead, GMWI2 represents a timely tool for evaluating health based on an individual's unique gut microbial composition, paving the way for the early screening of adverse gut health shifts. GMWI2 is offered as an open-source command-line tool, making it both accessible to and adaptable for researchers interested in the translational applications of human gut microbiome science.

Conflict of interest statement

Competing interests: D.C., V.K.G., and J.S. disclose that a patent application was filed relating to the materials in this manuscript. All other authors declare no competing interests.

Figures

Figure 1. Pooled analysis of stool metagenomes across multiple health and disease conditions from a diverse global representation.
(a) A survey was conducted in PubMed and Google Scholar to search for published studies with publicly available human stool shotgun metagenome (gut microbiome) samples from healthy (disease-free) and non-healthy (diseased) individuals. The initial collection of stool metagenomes consisted of 12,957 samples from 73 independent studies. All samples (.fastq files) were downloaded and reprocessed uniformly using identical bioinformatics methods. After quality control of sequenced reads, taxonomic profiling was performed using MetaPhlAn3. Studies and metagenome samples were removed based on several exclusion criteria. Finally, a total of 8,069 samples (5,547 and 2,522 metagenomes from healthy and non-healthy individuals, respectively) from 54 studies ranging across healthy and 11 non-healthy phenotypes were assembled into a pooled metagenome dataset for downstream analyses. (b) Demographics of the pooled dataset of 8,069 human stool metagenomes from 54 published studies. Subject demographics, as reported in the original studies, include age (n = 4,442), sex (n = 5,019), and country of origin (n = 7,647).
Figure 2. Gut microbiome taxonomic profiles of healthy and non-healthy individuals inform a Lasso-penalized logistic regression classification model.
(a) Principal component analysis (PCA) of gut microbiome profiles reveals significant differences in the distribution of healthy (disease-free) (blue, n = 5547) and non-healthy (diseased) (red, n = 2522) groups (P < 0.05, PERMANOVA). Ellipses represent 95% confidence regions. The top 10 PC1 and PC2 loading vector magnitudes are shown. (b) Coefficient values for the Lasso-penalized logistic regression model. The model includes 49 taxa with positive coefficients, 3105 taxa with zero coefficients, and 46 taxa with negative coefficients.
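For orientation, the textbook form of a Lasso-penalized logistic regression is sketched below. This is the standard formulation, not one quoted from the preprint: the symbols x_i (taxonomic feature vector of sample i), y_i (1 for healthy, 0 for non-healthy), and λ (penalty strength) are assumptions introduced here, and this record does not state how taxon abundances are encoded or how λ was chosen. From the caption, n = 8,069 samples and the coefficient vector has 49 + 3,105 + 46 = 3,200 entries.

```latex
\[
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\,\beta}
    \; -\frac{1}{n} \sum_{i=1}^{n}
      \Bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr]
    + \lambda \lVert \beta \rVert_1 ,
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + x_i^{\top} \beta)}} .
\]
```

The L1 term drives many coefficients to exactly zero, which is consistent with the caption's report that most taxa receive zero coefficients while a small set keep positive or negative weights.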
Figure 3. Enhanced classification of healthy and non-healthy stool metagenomes using Gut Microbiome Wellness Index 2 (GMWI2).
(a) GMWI2 best stratifies healthy and non-healthy groups compared to GMWI and α-diversity indices (d, Cliff’s Delta effect size; P-values from the Mann-Whitney U test). Balanced accuracies on the training set are shown for GMWI2 and GMWI. (b) The healthy group (blue, far left) exhibits significantly higher GMWI2 scores than all 11 non-healthy phenotypes. (c) Bins of GMWI2 and GMWI scores (x-axis). The heights of the black and gray bars indicate metagenome sample counts in each GMWI2 and GMWI bin, respectively (y-axis, left). Points represent the proportion of samples in each GMWI2 or GMWI bin corresponding to actual healthy and non-healthy individuals (y-axis, right). (d) Increasing the magnitude cutoff improves the classification performance of GMWI2, raising training set balanced accuracy (blue, y-axis, left) at the expense of fewer retained samples (orange, y-axis, right). (e) Classification performances of GMWI and GMWI2 in distinguishing healthy and non-healthy groups. Balanced accuracies are depicted for both groups on the training set, leave-one-out cross-validation (CV), and 10-fold CV, using varying magnitude cutoffs (0, 0.5, 1.0) of GMWI and GMWI2 scores. Balanced accuracies are shown between the blue and pink bars, which represent healthy and non-healthy groups, respectively. For the 10-fold CV, repeated random sub-sampling was performed 10 times, and the average results are displayed.
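The trade-off in panel (d) between balanced accuracy and retained samples can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this record: a positive score is read as a "healthy" prediction and a non-positive score as "non-healthy", and samples whose absolute score falls below the cutoff are rejected. The function name and the toy scores are hypothetical and are not taken from the GMWI2 tool or its data.

```python
def balanced_accuracy_with_cutoff(scores, labels, cutoff=0.0):
    """Balanced accuracy on samples with |score| >= cutoff.

    scores: index values; the sign gives the predicted class (assumption).
    labels: True for healthy, False for non-healthy.
    Returns (balanced_accuracy, retained_fraction).
    """
    scores = list(scores)
    labels = list(labels)
    # Reject low-confidence samples whose score magnitude is below the cutoff.
    kept = [(s, y) for s, y in zip(scores, labels) if abs(s) >= cutoff]
    healthy = [s for s, y in kept if y]
    non_healthy = [s for s, y in kept if not y]
    if not healthy or not non_healthy:
        raise ValueError("need at least one retained sample per class")
    # Fraction of healthy samples predicted healthy (score > 0).
    sensitivity = sum(s > 0 for s in healthy) / len(healthy)
    # Fraction of non-healthy samples predicted non-healthy (score <= 0).
    specificity = sum(s <= 0 for s in non_healthy) / len(non_healthy)
    return (sensitivity + specificity) / 2, len(kept) / len(scores)


# Toy values for illustration only; not data from the study.
toy_scores = [2.1, 0.3, -0.2, 1.4, -1.8, 0.4, -0.6, -2.5]
toy_labels = [True, True, True, True, False, False, False, False]

for cutoff in (0.0, 0.5, 1.0):
    acc, retained = balanced_accuracy_with_cutoff(toy_scores, toy_labels, cutoff)
    print(f"cutoff={cutoff}: balanced accuracy={acc:.3f}, retained={retained:.3f}")
```

On these toy values, raising the cutoff removes the three samples nearest zero, which include both misclassified ones, so balanced accuracy rises while the retained fraction falls.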
Figure 4. GMWI2 demonstrates effective generalization across diverse study populations.
(a) Classification performance on each excluded study in inter-study validation (ISV) is displayed by gold points (y-axis, right). The studies on the x-axis are rank-ordered based on either classification performance for a single phenotype (healthy or non-healthy) or balanced accuracy in the case of both phenotypes. The stacked bars illustrate the number of healthy (blue) and non-healthy (pink) stool metagenome samples in each study (y-axis, left). (b) Receiver operating characteristic curves for classification performance in distinguishing healthy and non-healthy phenotypes on the training set, 10-fold CV, and ISV.
Figure 5. Reanalysis of existing longitudinal gut microbiome studies with GMWI2.
(a) Changes in GMWI2 in patients with irritable bowel syndrome observed six months (6-mo) after undergoing fecal microbiota transplantation (FMT). Only subjects experiencing symptom relief (“Effect” group) displayed a significant increase in GMWI2 (P = 0.039, one-sided Wilcoxon signed-rank test). n, number of FMT donor samples (17 total samples from two healthy donors) or FMT recipients. (b) GMWI2 scores for the dietary groups (EEN, Vegan, and Omnivore) at baseline and the first 5 to 6 days of dietary intervention. The EEN group showed significant changes in GMWI2, with values significantly decreased by day 2 and thereafter (P < 0.05, two-sided Wilcoxon signed-rank test). No significant change in GMWI2 was observed for the Omnivore and Vegan groups compared to baseline. (c) GMWI2, Shannon Index, and species richness before and after antibiotic intervention. Despite recovery in Shannon Index and species richness at days 42 and 180, respectively, GMWI2 remained significantly lower compared to day 0, suggesting incomplete gut microbiome recovery even after ~6 months (P < 0.05, two-sided Wilcoxon signed-rank test). (d) GMWI2 of microbial communities after 24-hour in vitro fecal fermentation of five different prebiotic oligosaccharides. The height of the bars represents the mean GMWI2 (experiment conducted in triplicate for each study group), and error bars indicate the standard deviation from the mean. Different lowercase letters denote groups with significant differences in GMWI2 as determined by Tukey’s HSD test (P < 0.05). Control groups: NS0, no substrate addition at 0 h; NS24, no substrate for 24 h. Prebiotic groups: FS24, fructooligosaccharide; IN24, inulin; GS24, galactooligosaccharide; XS24, xylooligosaccharide; FL24, 2’-fucosyllactose.

