Review
Biology (Basel). 2024 Oct 22;13(11):848.
doi: 10.3390/biology13110848.

Integrating Molecular Perspectives: Strategies for Comprehensive Multi-Omics Integrative Data Analysis and Machine Learning Applications in Transcriptomics, Proteomics, and Metabolomics

Pedro H Godoy Sanches et al. Biology (Basel). 2024.

Abstract

With the advent of high-throughput technologies, the field of omics has made significant strides in characterizing biological systems at various levels of complexity. Transcriptomics, proteomics, and metabolomics are the three most widely used omics technologies, each providing unique insights into different layers of a biological system. However, analyzing each omics data set separately may not provide a comprehensive understanding of the subject under study. Therefore, integrating multi-omics data has become increasingly important in bioinformatics research. In this article, we review strategies for integrating transcriptomics, proteomics, and metabolomics data, including co-expression analysis, metabolite-gene networks, constraint-based models, pathway enrichment analysis, and interactome analysis. We discuss combined omics integration approaches, correlation-based strategies, and machine learning techniques that utilize one or more types of omics data. By presenting these methods, we aim to provide researchers with a better understanding of how to integrate omics data to gain a more comprehensive view of a biological system, facilitating the identification of complex patterns and interactions that might be missed by single-omics analyses.

Keywords: metabolomics; multi-omics; omics data; omics integration; proteomics; transcriptomics.
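
The correlation-based strategies mentioned in the abstract (for example, metabolite-gene networks) can be illustrated with a minimal sketch. It assumes two omics layers measured on the same samples in the same order, and keeps a gene-metabolite pair as a network edge when the absolute Pearson correlation reaches a cutoff. The function names, the toy values, and the 0.8 threshold are hypothetical choices for illustration and are not taken from the review.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)


def correlation_edges(genes, metabolites, threshold=0.8):
    """Return (gene, metabolite, r) for every pair with |r| >= threshold."""
    edges = []
    for gene, gene_profile in genes.items():
        for met, met_profile in metabolites.items():
            r = pearson(gene_profile, met_profile)
            if abs(r) >= threshold:
                edges.append((gene, met, round(r, 3)))
    return edges


# Hypothetical toy data: one value per sample, same sample order in both layers.
genes = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],
}
metabolites = {
    "met1": [2.1, 3.9, 6.2, 8.0, 9.9],
    "met2": [1.0, 1.2, 0.9, 1.1, 1.0],
}

edges = correlation_edges(genes, metabolites)
for gene, met, r in edges:
    print(f"{gene} -- {met}: r = {r}")
```

In this toy setting only `met1` is linked to the two genes (one positive, one negative association), while the nearly flat `met2` profile yields no edge. With real data, a fixed cutoff would usually be replaced or complemented by a significance test with multiple-testing correction.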

Conflict of interest statement

The authors have not declared any conflicts of interest.

Figures

Figure 1
Strategies for integrating omics data. Methods are based on correlation-based approaches, which identify associations between different types of data; machine learning algorithms, which can predict outcomes and identify patterns across data sets; and combined individual approaches, which map the interactions and relationships between molecular components.
Figure 2
Scatter plot between the log fold change from a differential gene expression test (logFCt) and the log fold change of the protein levels (logFCp) in three scenarios: (A) high association between transcriptomics and proteomics data; (B,C) disagreement between the changes in gene expression and protein levels for some genes/proteins. The red dashed 45-degree line indicates the theoretical correspondence where changes in gene expression at the RNA-Seq level would be equally reflected at the protein level.
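
The comparison described in the Figure 2 caption can be approximated numerically. The sketch below assumes per-gene log fold changes are already available for both layers, computes their Pearson correlation, and flags genes whose protein-level change departs from the 45-degree line by more than a tolerance. The names `log_fc_t` and `log_fc_p` mirror the caption's logFCt and logFCp; the values, the `tolerance` of 1.0, and the use of the simple difference logFCp - logFCt (rather than a perpendicular distance) are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # no variation, so no measurable association
    return cov / (sx * sy)


def discordant_genes(log_fc_t, log_fc_p, tolerance=1.0):
    """Genes whose |logFCp - logFCt| exceeds `tolerance` (off the 45-degree line)."""
    shared = [g for g in log_fc_t if g in log_fc_p]
    return sorted(g for g in shared if abs(log_fc_p[g] - log_fc_t[g]) > tolerance)


# Hypothetical log fold changes keyed by gene identifier.
log_fc_t = {"g1": 2.0, "g2": -1.5, "g3": 0.5, "g4": 1.8, "g5": -2.0}
log_fc_p = {"g1": 1.8, "g2": -1.2, "g3": 0.4, "g4": -0.9, "g5": -1.7}

shared = [g for g in log_fc_t if g in log_fc_p]
r = pearson([log_fc_t[g] for g in shared], [log_fc_p[g] for g in shared])
flagged = discordant_genes(log_fc_t, log_fc_p)

print(f"Pearson r across {len(shared)} genes: {r:.2f}")
print("Discordant genes:", flagged)
```

Here the single gene `g4`, up-regulated at the transcript level but down-regulated at the protein level, is the only one flagged, and it is also what pulls the overall correlation down. This corresponds to the kind of disagreement shown in panels B and C.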
Figure 3
The pipeline illustrates the differences between supervised and unsupervised learning strategies applied to omics data. Legend: PCA: Principal Component Analysis; t-SNE: t-Distributed Stochastic Neighbor Embedding; UMAP: Uniform Manifold Approximation and Projection; ICA: Independent Component Analysis; SVM: Support Vector Machines; PLS-DA: Partial Least-Squares Discriminant Analysis; LASSO: Least Absolute Shrinkage and Selection Operator; RMSE: Root-Mean-Square Error; VIP: Variable Importance in Projection; ROC Curve: Receiver Operating Characteristic Curve.
