Review
MedComm (2020). 2023 Jul 31;4(4):e315.
doi: 10.1002/mco2.315. eCollection 2023 Aug.

Applications of multi-omics analysis in human diseases

Chongyang Chen et al. MedComm (2020).

Abstract

Multi-omics usually refers to the combined application of multiple high-throughput screening technologies, represented by genomics, transcriptomics, single-cell transcriptomics, proteomics, metabolomics, spatial transcriptomics, and so on, which play a great role in promoting the study of human diseases. Most current reviews focus on describing the development of multi-omics technologies, data integration, or application to a particular disease; few provide a comprehensive and systematic introduction to multi-omics. This review outlines the existing technical categories of multi-omics and cautions for experimental design; focuses on integrated analysis methods, especially machine learning and deep learning approaches to multi-omics data integration and the corresponding tools; covers the application of multi-omics in medical research (e.g., cancer, neurodegenerative diseases, aging, and drug target discovery) together with the corresponding open-source analysis tools and databases; and finally discusses the challenges and future directions of multi-omics integration and application in precision medicine. With the development of high-throughput technologies and data integration algorithms, single-cell multi-omics and spatial multi-omics, as important directions of multi-omics for future disease research, are also introduced in detail. This review will provide important guidance for researchers, especially those who are just entering multi-omics medical research.

Keywords: biomarker; machine learning and deep learning; multi‐omics; neurodegenerative disease; precision medicine.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIGURE 1
Multi‐omics approaches in disease research. The main methods currently available for each omics layer are listed. RNA (transcriptomics): next‐generation sequencing (NGS), single‐cell sequencing, and microRNA microarray. Proteins (proteomics): 2D differential gel electrophoresis (2D‐DIGE), liquid chromatography‐mass spectrometry (LC‐MS), matrix‐assisted laser desorption ionization time‐of‐flight mass spectrometry (MALDI‐TOF‐MS), tandem mass tag (TMT), and selected reaction monitoring (SRM). DNA (genomics and epigenomics): genome‐wide association studies (GWAS), whole‐exome sequencing (WES), NGS, and single‐cell sequencing. Images (radiomics): positron emission tomography (PET), magnetic resonance imaging (MRI), and computed tomography (CT). Metabolites (metabolomics): mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. Microbes (microbiomics): 16S rRNA gene sequencing.
FIGURE 2
Notable recommendations for the experimental design of multi‐omics studies. Many factors considered during experimental design influence the results; the three main ones are disease characteristics, choice of disease model, and sample size together with phenotypic data. Disease characteristics include pathogenic factors (bacteria or viruses), genome instability and mutation, progressive course (as in neurodegenerative disease), and sample type (tissue, cell, blood, or microbe). The disease model can be chosen according to the suitable model type (human, mouse, or cell line), sample availability (some samples, such as brain tissue and cerebrospinal fluid, may be unavailable), and human omics databases (existing human disease omics data can compensate for uncommon disease models). Sample size and phenotypic data also need to be considered: the number of animal or human samples should be large enough to give reliable omics results at reasonable cost, and the appropriate number usually depends on confounding factors (batch effects and environmental stressors such as diet and smoking). Phenotypic data include pathology, questionnaires, images, and so on.
FIGURE 3
Methods for multi‐omics data integration. The figure gives a simplified view of how multi‐omics data (such as genomics, epigenetics, transcriptomics, proteomics, and metabolomics) are integrated on the basis of correlations between omics layers, molecular network construction at different levels, and machine learning. The ultimate goal of data integration is to discover disease biomarkers, confirm the phenotypic spectrum and mechanisms, and identify therapeutic targets.
FIGURE 4
Overview of integrating multi‐omics data with machine learning and deep learning. (A) Data integration by machine learning. The concatenation‐based pipeline starts from the raw data of the individual omics with the corresponding phenotypic information; the individual omics data are then concatenated into a single large multi‐omics matrix, and finally, supervised or unsupervised methods are used to analyze the joint matrix. The model‐based pipeline also starts from the raw data of the individual omics and the corresponding phenotypic information, develops an individual model for each omics, integrates these into a joint model, and finally analyzes the joint model. The transformation‐based pipeline starts from the raw data of the individual omics and the corresponding phenotypic information, develops an individual transformation (in the form of graphs or kernel relations) for each omics, integrates these into a joint transformation, and finally analyzes it. The letters PGPM denote, in order, phenotypic data (P), genomic data (G), proteomic data (P), and metabolomic data (M). (B) Data integration by deep learning. First, the multi‐omics data are preprocessed and cleaned, and conventional feature selection or dimensionality reduction methods are used to reduce the number of multi‐omics variables. Next, the variables from the multiple omics are concatenated into one large data set for integration. Finally, further feature selection or reduction techniques are applied to reduce the variables, and the integrated data are analyzed using classification, regression, and clustering.
FIGURE 5
The application of multi‐omics in disease, aging, and natural drug target identification. The figure lists some disease‐associated applications of multi‐omics in cancer (liver cancer, lung cancer, ovarian cancer, and malignant lymphocytic tumor) and neurodegeneration (AD, PD, and ALS). Other applications, such as aging and target screening for natural drugs (natural compounds obtained from the flowers, seeds, or rootstock of plants), are also shown. The brief process and methods for applying multi‐omics in each research area are listed on the right. For biomarker and target research, multi‐omics data are analyzed by differential expression analysis, correlation analysis, and network construction, and machine learning is then used to obtain diagnostic biomarkers or therapeutic targets. The procedure for screening natural drug targets is usually based on the integration of label‐free proteomics and chemical proteomics, followed by machine learning to identify possible functional targets.
FIGURE 6
Integrated multi‐omics analysis of sleep deprivation. (A) The brief procedure and data for the multi‐omics study of sleep deprivation. Blood samples were obtained from 32 volunteers who underwent sleep deprivation and recovery, and were used for plasma proteomics and metabolomics and for PBMC transcriptomics. After multi‐omics integration, the most prominent biological pathway induced by sleep deprivation was immune disorders, followed by metabolism disorders and neurodegenerative disease, and neutrophil degranulation mainly accounted for the immune change. The detailed analysis showed that immune cell counts and inflammatory factor levels were elevated, and that neutrophils and the immune processes they mediate were highly coordinated with sleep states. The correlation analysis revealed that SOD1 and S100A8 were the most likely biomarkers of sleep deprivation‐induced immune changes. (B) Integration of proteomics and metabolomics: the differentially expressed proteins and metabolites were linked to arginine and proline metabolism and the TCA cycle. (C) Integration of proteomics and transcriptomics: the shared top pathways enriched by differentially expressed genes and proteins are listed.
FIGURE 7
Challenges and opportunities for data integration of multi‐omics. The analysis challenges, together with possible solutions, are presented at three levels: data collection, integrative analysis, and community distribution.
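
As an illustration of two of the integration strategies named in the Figure 4 caption, the following is a minimal Python sketch, not the authors' implementation. It contrasts concatenation‐based integration (one joint matrix, one model) with model‐based integration (one model per omics layer, then combined). The simulated data, the ridge regression model, the averaging rule used to combine per‐omics models, and all function names are assumptions made for this example only.

```python
import numpy as np

def zscore(block):
    """Standardize each feature (column) to zero mean and unit variance."""
    std = block.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return (block - block.mean(axis=0)) / std

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression; returns the weight vector."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def concatenation_based(blocks, y, alpha=1.0):
    """Concatenate standardized omics blocks, then fit one joint model."""
    joint = np.hstack([zscore(b) for b in blocks])
    weights = ridge_fit(joint, y - y.mean(), alpha)
    return joint, joint @ weights + y.mean()

def model_based(blocks, y, alpha=1.0):
    """Fit one model per omics block, then average their predictions.

    Averaging is an assumed, deliberately simple way to form a joint model.
    """
    preds = []
    for b in blocks:
        Z = zscore(b)
        preds.append(Z @ ridge_fit(Z, y - y.mean(), alpha) + y.mean())
    return np.mean(preds, axis=0)

# Simulated data (hypothetical): 40 samples, three omics layers.
rng = np.random.default_rng(0)
n_samples = 40
genomics = rng.normal(size=(n_samples, 5))
proteomics = rng.normal(size=(n_samples, 4))
metabolomics = rng.normal(size=(n_samples, 3))
# The phenotype depends on one genomic and one proteomic feature plus noise.
phenotype = (genomics[:, 0] + 0.5 * proteomics[:, 1]
             + rng.normal(scale=0.1, size=n_samples))

blocks = [genomics, proteomics, metabolomics]
joint, early_pred = concatenation_based(blocks, phenotype)
late_pred = model_based(blocks, phenotype)
```

In this sketch the joint matrix has one row per sample and one column per feature across all omics layers. The predictions are in‐sample only; a real analysis would evaluate on held‐out samples and would typically need feature selection or dimensionality reduction first, as the caption notes for panel B.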
