Nat Commun. 2024 Mar 23;15(1):2621.
doi: 10.1038/s41467-024-46888-3.

Multi-omic integration of microbiome data for identifying disease-associated modules

Efrat Muller et al. Nat Commun.

Abstract

Multi-omic studies of the human gut microbiome are crucial for understanding its role in disease across multiple functional layers. Nevertheless, integrating and analyzing such complex datasets poses significant challenges. Most notably, current analysis methods often yield extensive lists of disease-associated features (e.g., species, pathways, or metabolites), without capturing the multi-layered structure of the data. Here, we address this challenge by introducing "MintTea", an intermediate integration-based approach combining canonical correlation analysis extensions, consensus analysis, and an evaluation protocol. MintTea identifies "disease-associated multi-omic modules", comprising features from multiple omics that shift in concord and that collectively associate with the disease. Applied to diverse cohorts, MintTea captures modules with high predictive power, significant cross-omic correlations, and alignment with known microbiome-disease associations. For example, analyzing samples from a metabolic syndrome study, MintTea identifies a module with serum glutamate- and TCA cycle-related metabolites, along with bacterial species linked to insulin resistance. In another dataset, MintTea identifies a module associated with late-stage colorectal cancer, including Peptostreptococcus and Gemella species and fecal amino acids, in line with these species' metabolic activity and their coordinated gradual increase with cancer development. This work demonstrates the potential of advanced integration methods in generating systems-level, multifaceted hypotheses underlying microbiome-disease interactions.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. MintTea pipeline illustration and the multi-omic datasets analyzed in this study.
A An illustration of the MintTea pipeline, including data preprocessing, repeated module discovery using sparse generalized CCA (sGCCA), consensus analysis, and evaluation of each module’s association with the disease. The letters “H” and “D” represent the terms “healthy” and “disease”, respectively. See Methods. B Data types used in this analysis. Throughout the manuscript, taxonomic, pathway, fecal metabolite, and serum metabolite features are labeled with the letters ‘T’, ‘P’, ‘M’ and ‘S’, respectively. C The number of features available from each omic in each dataset after preprocessing. Sample sizes are noted under dataset names. CD Crohn’s disease, CRC colorectal cancer, ESRD end-stage renal disease, MS metabolic syndrome, T2D type-2 diabetes, UC ulcerative colitis.
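The consensus step in panel A can be illustrated with a minimal sketch. It assumes each subsampling iteration has already produced a list of putative modules (sets of feature names), links two features when they share a putative module in more than a chosen fraction of iterations, and reports the connected components of that graph as consensus modules. The function name `consensus_modules`, the connected-components rule, and the toy feature names are illustrative assumptions, not the authors' implementation; the default threshold of 0.8 mirrors the edge definition given in the Fig. 2D legend.

```python
from collections import Counter
from itertools import combinations


def consensus_modules(runs, threshold=0.8):
    """Group features that repeatedly co-occur across subsampling runs.

    runs: list of iterations; each iteration is a list of putative modules,
          and each module is a set of feature names (hypothetical input format).
    threshold: two features are linked when they share a putative module in
          more than this fraction of runs (assumed default).
    Returns a list of sets, one per connected component of the linked graph.
    """
    pair_counts = Counter()
    for modules in runs:
        pairs_in_run = set()
        for module in modules:
            # Sorting gives every pair a canonical (a, b) order.
            pairs_in_run.update(combinations(sorted(module), 2))
        pair_counts.update(pairs_in_run)  # count each pair once per run

    adjacency = {}
    for (a, b), count in pair_counts.items():
        if count / len(runs) > threshold:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    # Connected components via depth-first search.
    components, visited = [], set()
    for start in sorted(adjacency):
        if start in visited:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            component.add(node)
            stack.extend(adjacency[node] - visited)
        components.append(component)
    return components


# Toy example: three features co-occur in at least 9 of 10 runs,
# while "P_d" joins them in only 5 runs and is therefore left out.
runs = (
    [[{"T_a", "M_b", "S_c", "P_d"}]] * 5
    + [[{"T_a", "M_b", "S_c"}]] * 4
    + [[{"T_a"}, {"M_b", "S_c"}]]
)
modules = consensus_modules(runs)
```

Raising the threshold makes the consensus stricter: with `threshold=0.95` only the pair that co-occurs in every run remains linked.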
Fig. 2. Identifying disease-associated multi-omic modules via MintTea intermediate integration framework.
A–C Properties of the multi-omic consensus modules obtained by applying MintTea to several key datasets, including A the number of features in each module, stratified by feature type, B cross-omic correlation per module, calculated as the average pairwise Spearman correlation between features of different omics (with gray points and lines indicating the average and standard deviation of correlations obtained from random modules of the same size), and C AUC per module, calculated using the first principal component values against disease label of each sample (with gray points and lines indicating the average and standard deviation of correlations obtained from random modules, as before). Modules that only included features from a single omic were discarded from our analyses but are listed in Supplementary Data Files S2 and S3. Modules that exhibit between-omic correlations (higher than random modules) and that are disease-associated (AUC > 0.7 and above that of random modules) are shown in a darker color. Circle colors under each dataset name indicate which omics were available for this dataset. The sample size of each dataset (i.e., the number of individuals profiled) is as shown in Fig. 1C. D A late-stage CRC-associated multi-omic module. Node colors represent feature types (see Fig. 1B). Edges connect features that appeared together in an sGCCA putative module in >80% of data subsampling iterations. Correlations between pairs of features within the module can be viewed in Supplementary Data File S4. E An MS-associated multi-omic module.
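The two per-module scores in panels B and C can be sketched in plain Python. This is a rough illustration under stated assumptions, not the MintTea code: features are passed as a dict of per-sample value lists, `omic_of` maps each feature to its omic label, the first principal component is approximated by power iteration on mean-centered (unscaled) features, and, because the sign of a principal component is arbitrary, `module_auc` reports the larger of AUC and 1 − AUC. All function names are hypothetical.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cross_omic_correlation(features, omic_of):
    """Mean pairwise Spearman correlation over feature pairs from different omics."""
    rhos = [
        spearman(features[a], features[b])
        for a, b in combinations(sorted(features), 2)
        if omic_of[a] != omic_of[b]
    ]
    return sum(rhos) / len(rhos)


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U); labels are 1 (disease) or 0 (healthy)."""
    r = ranks(scores)
    pos = [ri for ri, label in zip(r, labels) if label == 1]
    n_pos, n_neg = len(pos), len(labels) - len(pos)
    return (sum(pos) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def first_pc_scores(features, n_iter=200):
    """Per-sample scores on the first principal component (power iteration)."""
    names = sorted(features)
    n = len(features[names[0]])
    cols = []
    for name in names:
        mean = sum(features[name]) / n
        cols.append([v - mean for v in features[name]])
    w = [1.0] * len(cols)  # assumed start vector; must not be orthogonal to PC1
    for _ in range(n_iter):
        scores = [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]
        w = [sum(c[i] * scores[i] for i in range(n)) for c in cols]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            raise ValueError("features have no variance along the start vector")
        w = [x / norm for x in w]
    return [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]


def module_auc(features, labels):
    """AUC of the module's first principal component, ignoring PC sign (assumption)."""
    a = auc(first_pc_scores(features), labels)
    return max(a, 1 - a)


# Toy module: one taxon and one fecal metabolite that rise together in disease.
toy = {"T_x": [1, 2, 3, 10, 11, 12], "M_y": [2, 4, 6, 20, 22, 24]}
toy_omics = {"T_x": "T", "M_y": "M"}
toy_labels = [0, 0, 0, 1, 1, 1]
toy_rho = cross_omic_correlation(toy, toy_omics)
toy_auc = module_auc(toy, toy_labels)
```

The same two functions can be applied to randomly drawn feature sets of equal size to build the null comparison described in the legend; how the random modules are sampled is not specified here.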
Fig. 3. Multi-omic modules across datasets.
A Overlaps between multi-omic modules of different datasets. Each sector in the Circos plot represents a consensus module, grouped by the datasets to which they belong. Black dots represent disease-associated multi-omic modules (as previously defined). Links between modules indicate overlaps of at least 2 features, with darker links indicating statistically significant dependences (Fisher’s exact test FDR < 0.1). B Multi-omic modules from multiple datasets that include B. uniformis, and additional overlapping features. For each module (from the dataset listed on top), all module features that appear in at least one other module are presented. Triangles pointing up indicate that the feature level was significantly increased in disease in that dataset, while triangles pointing down indicate an opposite trend (Mann–Whitney tests, FDR < 0.1). Circles indicate no significant difference between study groups. The module in dark orange (CRC, Feng) was also associated with disease state. STH soil-transmitted helminths, HT hypertension, ME-CFS myalgic encephalomyelitis/chronic fatigue syndrome, IGT impaired glucose tolerance, SP super-pathway. Also see abbreviations in the legend of Fig. 1.
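The overlap test behind the links in panel A can be sketched as follows. The legend names Fisher’s exact test and an FDR threshold but not the exact setup, so this sketch makes several assumptions: a one-sided (enrichment) test computed as a hypergeometric upper tail, a background `universe_size` equal to the number of features that both modules could have drawn from, and Benjamini-Hochberg adjustment for the FDR. The function names are hypothetical.

```python
from math import comb


def overlap_pvalue(module_a, module_b, universe_size):
    """One-sided Fisher's exact test for the overlap of two feature sets.

    Returns P(overlap >= observed) when a set of size |module_b| is drawn at
    random from `universe_size` features, |module_a| of which are "marked".
    """
    k = len(module_a & module_b)
    a, b = len(module_a), len(module_b)
    tail = sum(
        comb(a, i) * comb(universe_size - a, b - i)
        for i in range(k, min(a, b) + 1)
    )
    return tail / comb(universe_size, b)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: two identical 3-feature modules in a 10-feature universe.
p_identical = overlap_pvalue({"f1", "f2", "f3"}, {"f1", "f2", "f3"}, 10)
p_disjoint = overlap_pvalue({"f1", "f2"}, {"f3", "f4"}, 10)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

In this sketch a link would be drawn between two modules when they share at least 2 features, and drawn darker when the adjusted p-value falls below 0.1, matching the thresholds stated in the legend.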
