Stat Med. 2021 Jun 15;40(13):3053-3065.

doi: 10.1002/sim.8957. Epub 2021 Mar 26.

Pathway testing for longitudinal metabolomics

Mitra Ebrahimpoor¹, Pietro Spitali², Jelle J Goeman¹, Roula Tsonaka¹

Affiliations

¹ Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.
² Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.

PMID: 33768548
PMCID: PMC8252476
DOI: 10.1002/sim.8957

Mitra Ebrahimpoor et al. Stat Med. 2021.

Abstract

We propose a top-down approach for pathway analysis of longitudinal metabolite data. We apply a score test based on a shared latent process mixed model, which can identify pathways with differentially progressing metabolites. The strength of our approach is that it handles unbalanced designs, accommodates missing values in the longitudinal markers, and gives valid results even with small sample sizes. In contrast to bottom-up approaches, correlations between metabolites are explicitly modeled, which yields power gains. For large pathways, a computationally efficient solution based on pseudo-likelihood methodology is proposed. Through simulation studies, we demonstrate the advantages of the proposed method in identifying differentially expressed pathways. Finally, longitudinal metabolite data from a mouse experiment are analyzed to illustrate our methodology.

Keywords: global test; joint latent process; longitudinal analysis; mixed model; pseudo likelihood.

Figures

**FIGURE 1**
Pairwise fitting approach: A separate model is fitted on each pair of longitudinal outcomes $(y_r, y_s)$ and the bivariate likelihoods $L(y_r, y_s)$ are optimized, giving rise to the MLE $\hat{\Theta}^{r,s}$, where $\hat{\Theta}^{r,s} = (\beta_r, \sigma_r, \beta_s, \sigma_s)$. Parameters that are estimated from more than one pair, for example $\beta$, are averaged, giving rise to $\overline{\beta}$, $\overline{\Sigma}_u$, $\overline{\Sigma}_b$, and $\overline{\Sigma}_\epsilon$. These estimates are then used in Equation (8) to derive the Empirical Bayes estimates from step I based on the full joint model.
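The averaging step described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the bivariate models have already been fitted; the function name `average_pairwise_estimates`, the outcome labels, and the numeric estimates are all hypothetical and serve only to show how an estimate that appears in several pairs is averaged.

```python
from collections import defaultdict
from itertools import combinations
from statistics import fmean


def average_pairwise_estimates(pair_estimates):
    """Average each outcome's estimate over all pairs in which it appears.

    pair_estimates maps a pair (r, s) to a dict {r: estimate_r, s: estimate_s}.
    The model fitting itself is NOT shown; only the averaging is sketched.
    """
    collected = defaultdict(list)
    for pair, estimates in pair_estimates.items():
        for outcome in pair:
            collected[outcome].append(estimates[outcome])
    return {outcome: fmean(values) for outcome, values in collected.items()}


# Hypothetical estimates for three outcomes, one dict per fitted pair.
outcomes = ["m1", "m2", "m3"]
fits = {
    ("m1", "m2"): {"m1": 0.50, "m2": 1.00},
    ("m1", "m3"): {"m1": 0.70, "m3": -0.20},
    ("m2", "m3"): {"m2": 1.20, "m3": -0.40},
}

# Every pair of outcomes must have been fitted exactly once.
assert set(fits) == set(combinations(outcomes, 2))

averaged = average_pairwise_estimates(fits)
```

With three outcomes, each outcome-specific parameter is estimated in two pairs, so each averaged value here is the mean of two estimates.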

**FIGURE 2**
Simulation results: Colors represent the SLaPMEG (green), separate LMM (red), and pairwise (blue) methods. The correlations between metabolites and between subjects are both low in the left plot (A) and both high in the right plot (B). All methods have type I error rates close to 5%. SLaPMEG, shared latent process mixed effects modeling within global test

**FIGURE 3**
Simulation results: Colors represent the SLaPMEG (green), separate LMM (red), and pairwise (blue) methods. The plots show a scenario with a small sample size (n = 26) where only 1/3 of the metabolites are associated with the phenotype. The correlations between metabolites and between subjects are both low in the left plot (A) and both high in the right plot (B). Comparing the two plots, the separate LMM approach has less power under the high-correlation scenario. The SLaPMEG model has more power in general, especially for small sample sizes and/or small effect sizes. A similar pattern holds for a larger sample size (n = 100) and for pathways with more active genes (2/3). SLaPMEG, shared latent process mixed effects modeling within global test

**FIGURE 4**
Simulation results: Colors represent the SLaPMEG (blue) and pairwise (green) methods. The bars with darker colors show the results of Globaltest using all random effects; the bars with lighter colors show the results of Globaltest using only the random slope. These plots show a scenario with a small sample size (n = 26) where (A) 1/3 and (B) 2/3 of the metabolites are associated with a time-dependent effect. The SLaPMEG method has sufficient power to detect differential progression, and the pairwise method follows closely. A similar pattern holds for a larger sample size (n = 100). SLaPMEG, shared latent process mixed effects modeling within global test
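To make the idea of testing a set of random-effect estimates against group membership concrete, the following is a minimal sketch of a global-test-style score statistic with a permutation p-value. It rests on assumptions not stated in this record: group membership `y` is treated as the response, the rows of `Z` stand in for per-subject random-effect estimates, and the statistic is taken as e'ZZ'e / p with e = y - mean(y). The function names and data are hypothetical, and this is not the Globaltest implementation.

```python
import random


def global_test_statistic(y, Z):
    """Score-type statistic e' Z Z' e / p, where e = y - mean(y)."""
    n, p = len(Z), len(Z[0])
    y_bar = sum(y) / n
    resid = [yi - y_bar for yi in y]
    total = 0.0
    for j in range(p):
        # Covariance-like score of covariate j with the null residuals.
        score_j = sum(Z[i][j] * resid[i] for i in range(n))
        total += score_j ** 2
    return total / p


def permutation_p_value(y, Z, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = global_test_statistic(y, Z)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if global_test_statistic(y_perm, Z) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical data: 12 subjects in two groups, two covariates per subject.
# The first covariate separates the groups; the second does not.
y = [0] * 6 + [1] * 6
Z = [[2.0 if y[i] else -2.0, 0.1 if i % 2 == 0 else -0.1] for i in range(12)]

statistic = global_test_statistic(y, Z)
p_value = permutation_p_value(y, Z)
```

Restricting `Z` to a subset of columns (for example, only slope-type estimates) would correspond to testing that subset alone, in the spirit of the lighter bars in Figure 4.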

**FIGURE 5**
Simulation results: Colors represent the SLaPMEG (green) and separate LMM (red) methods. The left plot (A) shows type I error, and the right plot (B) shows power for a scenario with a small sample size (n = 26) in which 1/3 of the metabolites in each pathway are associated with the phenotype. For this figure, the data are simulated according to separate LMM models. The type I error rate is controlled even under misspecification of the correlation structure. Power follows a pattern similar to that for jointly simulated data. SLaPMEG, shared latent process mixed effects modeling within global test

**FIGURE 6**
DMD study results: (A) Time course of two metabolites from the *metabolism of polyamines* pathway, which contains nine metabolites, and (B) estimated b_ki values (metabolite-specific random effects) for the corresponding metabolites. Colors represent the WT (blue) and *mdx* (red) groups. The direction of effect is different for the two metabolites. Creatine values (top) lie below the shared longitudinal trajectory in the WT group compared with *mdx*. The opposite is true for the ornithine values (bottom). Despite the heterogeneity of effects, SLaPMEG identified the differential expression of this pathway between the two groups. DMD, Duchenne muscular dystrophy; SLaPMEG, shared latent process mixed effects modeling within global test; WT, wild type
