[Preprint]. 2023 Dec 13:2023.12.12.571384.
doi: 10.1101/2023.12.12.571384.

Unfolding and De-confounding: Biologically meaningful causal inference from longitudinal multi-omic networks using METALICA

Daniel Ruiz-Perez et al. bioRxiv.

Abstract

A key challenge in the analysis of microbiome data is the integration of multi-omic datasets and the discovery of interactions between microbial taxa, their expressed genes, and the metabolites they consume and/or produce. To improve the state of the art in inferring biologically meaningful multi-omic interactions, we sought to address some of the most fundamental issues in causal inference from longitudinal multi-omics microbiome data sets. We developed METALICA, a suite of tools and techniques that can infer interactions between microbiome entities. METALICA introduces novel unrolling and de-confounding techniques that uncover multi-omic entities believed to act as confounders for some of the relationships that may be inferred using standard causal inference tools. The results lend support to predictions about biological models and processes by which microbial taxa interact with each other in a microbiome. The unrolling process helps to identify putative intermediaries (genes and/or metabolites) that explain the interactions between microbes; the de-confounding process identifies putative common causes that may cause spurious relationships to be inferred. METALICA was applied to the networks inferred by existing causal discovery and network inference algorithms from a multi-omics data set collected in a longitudinal study of IBD microbiomes. The most significant unrollings and de-confoundings were manually validated using the existing literature and databases.
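
The two ideas described in the abstract can be illustrated with a minimal sketch. This is not the METALICA implementation: it assumes each inferred network is simply a list of directed (source, target) edges, it ignores time slices, edge weights, and bootstrap scores, and the function names `unroll` and `common_causes` are hypothetical. The first three edges reuse the entities named in the FIG 4 caption; `metabolite_X`, `taxon_A`, and `taxon_B` are made-up placeholders.

```python
from collections import defaultdict


def unroll(edge, richer_edges, taxa, max_intermediaries=2):
    """List paths src -> ... -> dst in the richer network whose
    intermediaries are all non-taxa (genes or metabolites)."""
    src, dst = edge
    adj = defaultdict(set)
    for a, b in richer_edges:
        adj[a].add(b)

    paths = []
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        for nxt in sorted(adj.get(node, ())):
            n_intermediaries = len(path) - 1
            if nxt == dst and n_intermediaries >= 1:
                # Reached the target through at least one intermediary.
                paths.append(path + [dst])
            elif (nxt not in taxa and nxt not in path
                  and n_intermediaries < max_intermediaries):
                stack.append((nxt, path + [nxt]))
    return sorted(paths)


def common_causes(edge, richer_edges):
    """List nodes that are parents of both endpoints of the edge in the
    richer network, i.e. candidate confounders (assumption: no further
    statistical check is performed here)."""
    src, dst = edge
    parents = defaultdict(set)
    for a, b in richer_edges:
        parents[b].add(a)
    return sorted((parents[src] & parents[dst]) - {src, dst})


# Taxa known to the sketch; everything else is treated as gene/metabolite.
taxa = {"Eubacterium siraeum", "Bacteroides thetaiotaomicron",
        "taxon_A", "taxon_B"}

# Edges of a richer (taxa + genes + metabolites) network.
richer_edges = [
    ("Eubacterium siraeum", "uridine kinase"),
    ("uridine kinase", "cytidine"),
    ("cytidine", "Bacteroides thetaiotaomicron"),
    ("metabolite_X", "taxon_A"),   # hypothetical
    ("metabolite_X", "taxon_B"),   # hypothetical
]

unrolled = unroll(("Eubacterium siraeum", "Bacteroides thetaiotaomicron"),
                  richer_edges, taxa)
confounders = common_causes(("taxon_A", "taxon_B"), richer_edges)
print(unrolled)
print(confounders)
```

In this simplified view, a taxon-to-taxon edge learned from taxa alone counts as unrolled when the richer network connects the same two taxa through genes and/or metabolites, and it is flagged for de-confounding when both taxa share a parent in the richer network.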

Keywords: Causal inference; Longitudinal microbiome analysis; Multi-omic integration; Unfolding; de-confounding.

Figures

FIG 1
Samples of the two-time-slice DBNs for the four different multi-omic subsets produced by PALM. Self-edges are not displayed to avoid clutter. Networks were learned with at most 3 parents per node. The four networks show the nodes representing variables from each omics data source organized in two large circles, one representing the variables for the current time point (blue) and the other for the next time point (orange). Node shapes represent the omics data source of the variable: taxa are filled circles, metabolites filled squares, genes filled diamonds, and clinical variables filled triangles. Red edges represent negative regression coefficients and green edges positive ones. Edge width is proportional to the regression coefficient and edge opacity to the bootstrap score. Node opacity is proportional to abundance. a) DBN learned with just taxa abundance (T). The dataset included the abundance of 27 bacteria and a clinical variable indicating the week the sample was obtained, and resulted in a network with 95 edges. b) DBN learned with taxa and metabolites (TM). A set of 19 metabolites was added to the previous dataset, and 164 edges were learned in this network. c) DBN learned with the taxa and genes dataset (TG). A set of 34 genes was added to the taxa dataset, and a network with 230 edges was learned. d) DBN learned with the 27 taxa, 34 genes, and 19 metabolites (TGM), resulting in a total of 311 edges.
FIG 2
Heatmap showing the proportion of edges unrolled by METALICA in the Crohn’s disease datasets for the networks obtained from PyCausal (TETRAD) as the alpha parameter varies, using datasets with and without temporal alignment. The last column shows the overall bootstrap score.
FIG 3
Heatmap showing the percentage of edges unrolled by METALICA in the Crohn’s disease datasets for all the methods, averaged over all parameter choices. The last column shows the overall bootstrap score.
FIG 4
Biologically confirmed unrolling. The edge Eubacterium siraeum → Bacteroides thetaiotaomicron learned in G_T (T) is unrolled into Eubacterium siraeum → uridine kinase → cytidine → Bacteroides thetaiotaomicron in G_TGM.
FIG 5
Biologically confirmed unrolling. The self-edge Bacteroides stercoris → Bacteroides stercoris learned in G_T (T) is unrolled into Bacteroides stercoris → uridine kinase → cytidine → Bacteroides stercoris in G_TGM.
