Adv Sci (Weinh). 2025 Jun;12(22):e2412775.
doi: 10.1002/advs.202412775. Epub 2025 Apr 2.

Integrative Multi-Omics and Routine Blood Analysis Using Deep Learning: Cost-Effective Early Prediction of Chronic Disease Risks

Zhibin Dong et al. Adv Sci (Weinh). 2025 Jun.

Abstract

Chronic noncommunicable diseases (NCDs) are often characterized by gradual onset and slow progression, and the difficulty of predicting them early remains a substantial health challenge worldwide. This study aims to explore the interconnectedness of disease occurrence through multi-omics studies and to validate it in large-scale electronic health records. To this end, multi-omics data from 160 sub-healthy individuals at high altitude were examined, and a deep learning model called Omicsformer was developed for detailed analysis and classification of routine blood samples. Omicsformer identified potential risks for nine diseases, including cancer, cardiovascular conditions, and psychiatric conditions. Analysis of risk trajectories from 20 years of large-scale clinical patient data confirmed the validity of the risk grouping in preclinical risk assessment, revealing trends of increased disease risk at the time of onset. Additionally, a straightforward NCD risk prediction system was developed that uses basic blood test results. This work highlights the role of multi-omics analysis in predicting chronic disease risk; developing and validating predictive models based on routine blood results can help advance personalized medicine and reduce the cost of disease screening in the community.

Keywords: multi‐omics; pathway discovery; potential risk classification.

Conflict of interest statement

All authors declare no competing interests.

Figures

Figure 1
Flowchart of the overall framework of our article. We first conducted sequencing and hematological sampling on a cohort of 160 healthy individuals. Using our clustering methodology, we then categorized the hematological data into three distinct risk groups. With the multi‐omics datasets, we classified these three risk groups and identified pertinent biomarkers. Next, using a battery of biological assays, we analyzed the identified biomarkers together with hematological parameters, which yielded comprehensive hematological panels. These panels were validated on extensive time‐series clinical data, and the work culminated in a user‐friendly web server interface.
Figure 2
Detailed classification of the transcriptome, proteome, and metabolome sequencing data. a) Distribution of RNA sequence types in transcriptomic data; b) Comprehensive metabolomic profiling of blood samples; c) Comprehensive metabolomic profiling of urine samples; d) Protein abundance distribution in proteomic data.
Figure 3
Using clinical routine blood data and omics data to build a routine blood panel for healthy people. a) Probability density plots of routine blood data for the whole population and for each cluster; "total" denotes the whole population, and 1, 2, and 3 denote the individual clusters; b) Results of deep cluster analysis of clinical routine blood data; c) GSEA results for the different groups; d) Relative risks of the routine blood panels for different diseases.
Figure 4
Comparison of classification results of multiple omics models and model loss convergence. a) Comparison of our proposed multi‐omics classification model with several mainstream multi‐omics models in terms of ACC, F1, and Purity. b) Convergence of the model losses over training epochs. c) Comparison of the metrics of our proposed model with the best metrics of the other methods as the hyperparameter a varies within its range.
Figure 5
Identification and analysis of characteristic substances. a) Multi‐Omics data interaction network. b) Analysis of transcriptome differences between high and low risk groups. c) Analysis of metabolite differences between high and low risk groups. d) Analysis of protein differences between high and low risk groups.
Figure 6
Large‐scale clinical data validation results. a) Distribution of clinical diagnoses by category; b) Number of routine blood samples available before diagnosis of various diseases over time; c) Trends in disease prediction accuracy pre‐diagnosis; d) Heat map of risk changes in 100 hypertensive patients over the five years before diagnosis; e) Risk and morbidity follow‐up of 100 people over 5 years; f) Changes in pre‐diagnosis risk group proportions across different diseases; g) Distribution of important routine blood features in different diseases.
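
The Purity metric named in the Figure 4 caption can be illustrated with a minimal sketch. This is not the authors' code: the function name `clustering_purity` and the toy risk labels are hypothetical, and the standard definition of purity is assumed (each cluster is credited with its most common reference label, and the credited counts are divided by the total number of samples).

```python
from collections import Counter, defaultdict


def clustering_purity(true_labels, cluster_ids):
    """Return the purity of a clustering against reference labels.

    Hypothetical helper for illustration only; assumes the standard
    definition: sum of per-cluster majority counts / number of samples.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    if len(true_labels) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each reference label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for label, cluster in zip(true_labels, cluster_ids):
        label_counts[cluster][label] += 1

    # Credit every cluster with the size of its majority label.
    majority_total = sum(max(c.values()) for c in label_counts.values())
    return majority_total / len(true_labels)


# Toy example (made-up data): six samples, three reference risk groups.
reference = ["low", "low", "mid", "mid", "high", "high"]
assigned = [0, 0, 0, 1, 2, 2]
purity = clustering_purity(reference, assigned)
```

In the toy example, cluster 0 holds two "low" samples and one "mid" sample, so five of the six samples agree with their cluster's majority label. Purity rewards homogeneous clusters but does not penalize splitting one group across many clusters, which is presumably why the caption reports it alongside ACC and F1.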
